Two general methods are employed to determine a student’s
marks or grades on the basis of test scores. One of these is the
percentage method and the other the class or group ability method.
The percentage method assumes that the tests employed are of such a
degree of difficulty that the student should be able to do correctly
from seventy to seventy-five per cent of the questions to receive the
lowest passing grade and higher per cents to obtain higher grades.
By this method a certain percentage of a perfect performance on the
test is required to make a given grade. When this method is employed
a student’s score and his grade are identical.

The percentage method might be satisfactory if teachers were
able to prepare tests or series of tests of equal difficulty. The more
the teachers fail in doing this the more unsatisfactory the method
becomes. That teachers do not succeed in making their tests of equal
difficulty became evident from an examination of each of thirty
series of tests employed by teachers in thirty different classes as a
basis for determining the students’ grades. In these thirty classes
the lowest percentage score was fifty-five and the highest eighty-eight.
The percentage score was found by dividing the average score for
each class by the total number of test items administered to the class.

One of these classes with an enrollment of sixty-five students and an
average intelligence test score of fifty-three made a percentage score of
fifty-six, whereas another class with the same intelligence score and an
enrollment of seventy-seven students made a percentage score of
eighty-eight. The total number of test items for the former class was
one hundred ninety-eight and for the latter one hundred fifty-six. A
third class with an enrollment of sixty-two students and an intelligence
test score of thirty-nine answered correctly, on the average, eighty-four
per cent of three hundred items, whereas a fourth class with an
enrollment of thirty-four students and an intelligence test score of
sixty-five answered correctly only sixty-nine per cent of two hundred
sixty-one items. These results show how widely the tests of college
teachers differ in difficulty. A fairly good student in the fourth class
might make a poor grade, but in the third class he might make a good
grade.
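
The percentage score used in these comparisons is simply the class's average raw score divided by the number of items administered. The arithmetic may be sketched as follows. The function name, the seventy per cent passing line, and the two average raw scores are assumptions made for illustration; the raw averages were back-computed from the reported percentages and are not taken from the original records.

```python
def percentage_score(average_raw_score: float, total_items: int) -> float:
    """Average raw score expressed as a per cent of the items administered."""
    return 100.0 * average_raw_score / total_items


# Hypothetical average raw scores chosen to match the reported 56 and 88 per cent.
class_a = percentage_score(110.9, 198)  # class tested on 198 items
class_b = percentage_score(137.3, 156)  # class tested on 156 items

PASSING_LINE = 70.0  # assumed lowest passing per cent under the percentage method

for name, pct in (("first class", class_a), ("second class", class_b)):
    verdict = "passes" if pct >= PASSING_LINE else "fails"
    print(f"{name}: {pct:.0f} per cent; the average student {verdict}")
```

Under a fixed passing line the average student of the first class would fail and the average student of the second would pass comfortably, although the two classes have the same average intelligence test score.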

By the class ability method those students whose achievement in 
respect to the entire class is average receive an average grade, whereas 
those whose achievement is relatively high receive high grades and 
those whose achievement is relatively low receive low grades. This 
method allows for the fact that teachers’ tests vary in difficulty, but 
it does not allow for the variation in ability from class to class. In 
the thirty freshmen classes already referred to, the intelligence test 
scores varied from thirty-three for a class of forty students to sixty-five 
for a class of thirty-four students. There are freshman classes with 
an enrollment of fifty-two whose intelligence test scores fall as low as 
twenty-eight, and there are graduate classes with an enrollment of 
ten students whose average intelligence test scores are as high as 
eighty-six. A good student in such classes as the latter might make a 
failing grade and in such classes as the former might make a high grade 
by the class ability method. In all of these classes the intelligence was 
determined by using the American Council Psychological Examination. 

From the results given it appears that both the percentage method 
and the group ability method are astonishingly inadequate for the 
purpose of correct measurement. Such methods should be discarded 
for others which are known to be more satisfactory. It is the purpose 
of this article to give evidence of a more satisfactory procedure in the 
assignment of grades on the basis of scores obtained from objective 
tests. 


THE DATA 


During the winter quarter of 1931 and again during the spring 
quarter of the same year, all teachers of classes in which freshmen 
students were enrolled were requested to hand in all of the grades 
for these classes and the scores on which the grades were based. At 
the close of the winter quarter sixteen teachers turned in the scores 
and grades for 1517 students enrolled in twenty classes. At the end 
of the spring quarter nineteen teachers turned in the scores and grades 
for 1704 students enrolled in thirty-four classes. For the twenty 
classes of the winter quarter the average enrollment was seventy-six,
and for the thirty-four classes of the spring quarter it was fifty. The
smallest number of students in any class of the winter quarter was 
thirty-nine and the largest number one hundred fifty-four. Corre- 
sponding numbers for the classes of the spring quarter are ten and 
one hundred eighteen. For the spring quarter nine of the classes had
an enrollment under thirty. 

The total number of students for the winter quarter and for the 
spring quarter as given in the preceding paragraph does not mean so 
many different students. (Therefore the number of grades is the 
same as the number of students.) The number of classes in which a 
single student appeared varied from one to four. Moreover not all 
of these students were freshmen, because there were a few classes in 
which there was a mixture of freshmen and sophomores. For some of 
the computations of this investigation all of the grades for all of the 
students were used, but for the computation of coefficients of correla- 
tion the grades used were only for those freshmen who were enrolled 
in both the winter and spring quarter classes for which data were 
received. This elimination of grades and students reduced the number
of grades for the winter quarter from 1517 to 1008 and for the
spring quarter from 1704 to 988. These are the grades for four 
hundred seven different freshmen who were enrolled in one or more 
classes of each quarter. For each freshman there are for each quarter 
approximately 2.5 grades, although the number of grades for individual 
students varies from one to four. By far the majority of our freshmen 
receive four grades per quarter. Our coefficients of correlation would 
have been higher if all of the grades for each of the four hundred seven 
freshmen could have been obtained for each quarter. 


THE GRADING SYSTEM AND METHODS OF TRANSLATING SCORES INTO 
GRADES 


The letters A, B, C, D, and F are used by our college to represent 
the different values in a five-point grading system. F represents a 
failing grade, A the highest grade and C an average grade. 

The teachers of the college employ different methods of translating 
scores into grades. According to their own statements those who 
turned in grades for this investigation employed in the main two 
different methods, but their distributions showed that they did not 
adhere very closely to their methods. For one of the methods the 
teachers used standard distributions, but these varied from teacher to 
teacher. In these distributions the A’s and F’s varied from five to
ten per cent; the C’s from thirty-five to fifty per cent; and the B’s and
D’s from twenty to twenty-five per cent. 

In the second method the class median was taken as the mid-value 
of a C and a distance on the scale of scores of 1.6 Q was selected as the 
range for each grade. As this method makes use of 1.6 Q and the 
median, it will be referred to hereafter as the 1.6 Q-Md method. In a
normal distribution the 1.6 Q-Md method would yield the following 
per cents for each of the five grades: 5.5, 24, 41, 24, 5.5. It may be 
stated that most of the teachers who said that they used this method 
did not adhere to it very rigidly as was discovered by re-computing 
their grades on the basis of the same scores used by them. 
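
The 1.6 Q-Md rule can be stated compactly in code. What follows is a minimal sketch and not the procedure as the teachers carried it out: the function names are illustrative, the treatment of a score falling exactly on a boundary is an assumption, and Q is taken as half the interquartile range as computed by Python's `statistics.quantiles`. The last function checks the per cents expected in a normal distribution, where Q is about 0.6745 of the standard deviation.

```python
from statistics import NormalDist, median, quantiles

GRADES = ("F", "D", "C", "B", "A")


def cut_points(mid_c: float, q: float) -> list[float]:
    """Boundaries F|D, D|C, C|B, B|A: each grade spans 1.6 Q, with C centred on mid_c."""
    return [mid_c + k * q for k in (-2.4, -0.8, 0.8, 2.4)]


def grade(score: float, mid_c: float, q: float) -> str:
    """Letter grade for one score (a score on a boundary gets the higher grade: an assumption)."""
    return GRADES[sum(score >= c for c in cut_points(mid_c, q))]


def grade_class(scores: list[float]) -> list[str]:
    """Apply the 1.6 Q-Md rule using the class's own median and quartile deviation."""
    q1, _, q3 = quantiles(scores, n=4)
    q = (q3 - q1) / 2  # Q, the semi-interquartile range
    md = median(scores)
    return [grade(s, md, q) for s in scores]


def normal_percentages() -> dict[str, float]:
    """Per cent of a normal distribution falling within each grade band."""
    z = NormalDist()
    q = z.inv_cdf(0.75)  # in a normal curve Q is about 0.6745 sigma
    cdf = [0.0] + [z.cdf(c) for c in cut_points(0.0, q)] + [1.0]
    return {g: 100 * (hi - lo) for g, lo, hi in zip(GRADES, cdf, cdf[1:])}


print({g: round(p, 1) for g, p in normal_percentages().items()})
```

The per cents printed come out close to the 5.5, 24, 41, 24, 5.5 quoted above; the small discrepancy in the extreme grades presumably reflects rounding in the published figures.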

A modification of the second method allows for variations in class 
intelligence. For example, if the median of the class on the intelligence 
scale is one-half Q above that for college students in general, the point 
in the distribution of test-scores for the mid-value of C is selected one- 
half Q on the test-score distribution below the median of the class. 
This method not only allows for variations in class intelligence, but it 
also employs a more or less fixed value, the intelligence test score for 
college students in general, as a starting point for making its measure- 
ments. As this method makes use of 1.6 Q and a median corrected 
on the basis of class intelligence, it will be referred to hereafter as the 
1.6 Q-CMd method. This study was made to compare the 1.6 Q-Md 
method, rigidly applied, and the 1.6 Q-CMd method with the teachers’
methods of translating scores into marks.
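
The correction of the median for class intelligence may be sketched as below. This is an illustration under stated assumptions, not the computation as it was actually performed: the function and parameter names are hypothetical, and the Q of the intelligence scale (`q_intel`) is treated as a known constant because the text does not say how it was obtained.

```python
def corrected_mid_c(class_median: float, q_scores: float,
                    class_intel: float, general_intel: float,
                    q_intel: float) -> float:
    """Mid-value of C after allowing for class intelligence.

    The class's intelligence median is expressed in Q units above (+) or
    below (-) the general college median, and the mid-point of C is moved the
    same number of test-score Q units in the opposite direction.
    """
    shift_in_q = (class_intel - general_intel) / q_intel
    return class_median - shift_in_q * q_scores


# Hypothetical class: test-score median 200 and Q of 10; intelligence median 55
# against a general median of 50, with an assumed intelligence-scale Q of 10,
# so the class stands one-half Q above college students in general.
mid_c = corrected_mid_c(200.0, 10.0, class_intel=55.0,
                        general_intel=50.0, q_intel=10.0)
print(mid_c)  # the mid-point of C falls one-half Q below the class median
```

Because the mid-point of C is lowered for an able class, the same raw scores earn somewhat higher marks; for a class below the general median the mid-point rises and the marks fall.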


RESULTS 


A comparison is made of the grades obtained by three different 
methods: The combined methods of all of the teachers; the 1.6 Q-Md 
method; and the 1.6 Q-CMd method. First, the distributions of the 
total number of grades obtained from the test scores of all of the 
classes by the three different methods are compared. The comparisons 
are made for both the winter and the spring quarter grades. The 
distributions are given in Table I. 


TABLE I. THE DISTRIBUTION OF GRADES AS DETERMINED BY THE TEACHERS,
THE 1.6 Q-Md METHOD, AND THE 1.6 Q-CMd METHOD

         Teachers’ methods     1.6 Q-Md method    1.6 Q-CMd method
Marks     Number  Per cent    Number  Per cent    Number  Per cent

Winter Quarter
A             94      6.20        56      3.69        46      3.03
B            366     24.13       391     25.77       335     22.08
C            720     47.46       643     42.39       606     39.94
D            269     17.73       331     21.82       399     26.30
F             68      4.48        96      6.33       131      8.64
Total       1517    100.00      1517    100.00      1517     99.99

Spring Quarter
A            100      5.87        78      4.58        70      4.11
B            388     22.77       405     23.76       329     19.31
C            784     46.01       715     41.96       679     39.85
D            367     21.54       403     23.65       478     28.05
F             65      3.81       103      6.04       148      8.69
Total       1704    100.00      1704     99.99      1704    100.01


The distribution of marks obtained by the teachers’ methods for
the winter quarter is quite badly skewed. There are more A’s than
F’s and more B’s than D’s. Eight per cent more of the marks lie above
than below C. This condition is not supported by the distribution of
marks obtained by the 1.6 Q-Md method nor by the distribution of
the scores made by our students on standardized tests. In the
distribution for the 1.6 Q-Md method twenty-nine per cent of the
marks lie above C and twenty-eight per cent of them lie below C.
Neither does the distribution for the teachers’ methods conform to 
any distribution or any combination of distributions which the 
teachers stated that they followed. For example, none of the teachers 
stated that they followed distributions with as few F’s and D’s as are 
found in this distribution. Moreover, only one of the sixteen teachers 
pretended to give more than forty per cent of C’s, but the percentage 
of C’s is over forty-seven. The one teacher who stated that he aimed 
to give fifty per cent of C’s, actually gave forty-four per cent of C’s 
to a relatively small class. Nine of the sixteen teachers stated that 
they used the 1.6 Q-Md method and that they followed it quite 
closely. The distribution obtained by the rigid application of the 
1.6 Q-Md method conforms quite closely to the distribution to be 
expected by its use. Only the percentage of D’s departs as much as 
2.18 from the expected percentage. 

The distribution of the marks obtained by the teachers’ methods for 
the spring quarter is much better balanced, but the tendency to grade
too high is again present. The teachers’ distribution for the spring
quarter is better than that for the winter quarter because more of the 
teachers, eleven out of nineteen, stated that they followed the 1.6 
Q-Md method and actually did follow it rather closely as indicated by 
their individual distributions. The distribution obtained by the use 
of the 1.6 Q-Md method conforms almost perfectly to the standard 
curve for this method. In no case do the percentages depart as much 
as one point from the expected percentages. 

Just how closely the distributions of the teachers who claimed the 
use of the 1.6 Q-Md method and those who used other methods did 
conform to the distributions obtained by the rigid application of the 
1.6 Q-Md method is shown in Table II. In this table a statement is 
made of the number of the teachers’ marks which were changed by the 
rigid application of the 1.6 Q-Md method. 


TABLE II. THE NUMBER AND PERCENTAGE OF THE TEACHERS’ WINTER AND
SPRING QUARTER MARKS WHICH WERE CHANGED BY THE RIGID APPLICATION
OF THE 1.6 Q-Md METHOD IN TRANSLATING SCORES INTO GRADES

                                    Number of:        Marks changed    Per cent
                                                       by rigid use    of marks
Method                        Teachers  Classes  Marks  of 1.6 Q-Md     changed

Winter Quarter
1.6 Q-Md method by teachers          9       13    999          220          22
Other methods                        7        7    518          172          33

Spring Quarter
1.6 Q-Md method by teachers         11       21   1167          131          11
Other methods                        8       13    537          178          33


The nine teachers who stated that they used the 1.6 Q-Md method 
in the winter quarter had twenty-two per cent of their marks changed 
by the rigid application of this method, whereas the remaining seven 
teachers who employed other methods had thirty-three per cent of 
their marks changed. For the spring quarter the percentage of 
changes in marks was only eleven for those who employed the 1.6 
Q-Md method, but for those who used the other methods it was again
thirty-three. These facts make it clear why the distribution of the
teachers’ spring quarter marks conforms so much more closely than 
their winter quarter distribution to that obtained by the 1.6 Q-Md 
method. 

The distribution of marks obtained by the use of the 1.6 Q-CMd 
method is skewed in a direction opposite to that obtained by the 
teachers’ methods. For the winter quarter twenty-five per cent of the 
marks lie above C whereas thirty-five per cent of them lie below C. 
Corresponding figures for the spring quarter distribution are twenty- 
three and thirty-seven respectively. The relatively large number of 
low grades is explained by the fact that the median had to be raised 
for sixty-five per cent of the classes and lowered for only thirty-five 
per cent. It will be remembered that in computing grades by this 
method the marks of all classes with average intelligence test scores 
below fifty were lowered whereas the marks of those with intelligence 
test scores above fifty were raised. There are more low grades for the
spring than for the winter quarter, because the percentage of students 
in the classes with intelligence test scores below fifty was larger for the 
spring than for the winter quarter. There should of course be more 
low than high grades in our distributions if they are to be made com- 
parable to those earned by the average college student. This general 
lowering of grades could have been avoided if an intelligence test score 
of forty-five had been substituted for that of fifty. 

There is evidence in the distributions of Table I that the 1.6 Q-Md 
and the 1.6 Q-CMd methods give more consistent results than the 
teachers’ methods. This will appear if the sum of the differences in the 
percentages of the winter and spring quarters for one of the three 
methods is compared with a similar sum for each of the other methods. 
This sum is 7.64 for the teachers’ methods, 5.44 for the 1.6 Q-Md 
method and 5.74 for the 1.6 Q-CMd method. 
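
The consistency check just described is a sum of absolute differences between the winter and spring per cents for each mark. A short sketch follows, using the per cents of Table I; the function name is illustrative only.

```python
# Per cents of A, B, C, D, F from Table I (winter, spring) for each method.
TABLE_I = {
    "teachers": ([6.20, 24.13, 47.46, 17.73, 4.48],
                 [5.87, 22.77, 46.01, 21.54, 3.81]),
    "1.6 Q-Md": ([3.69, 25.77, 42.39, 21.82, 6.33],
                 [4.58, 23.76, 41.96, 23.65, 6.04]),
    "1.6 Q-CMd": ([3.03, 22.08, 39.94, 26.30, 8.64],
                  [4.11, 19.31, 39.85, 28.05, 8.69]),
}


def sum_of_differences(first, second):
    """Sum of absolute differences between two percentage distributions."""
    return sum(abs(a - b) for a, b in zip(first, second))


sums = {m: sum_of_differences(w, s) for m, (w, s) in TABLE_I.items()}
for method, total in sums.items():
    print(f"{method}: {total:.2f}")
```

Recomputed from the rounded per cents, the sums fall within a few hundredths of the figures quoted above, and the ordering of the three methods is the same.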

Comparing the three methods on the basis of these general dis- 
tributions of marks is an inadequate procedure. It does not give the 
results for individual classes and obscures the facts, as positive and 
negative errors tend to offset each other. I shall, therefore, present 
distributions for a few individual classes and the number of marks 
which were changed in each class as the result of replacing the teachers’ 
methods by each of the others. In Table III are given the distribu- 
tions of marks computed by each of the three methods for each of two 
classes which were taught by different teachers but whose marks were 
based on the same tests.


TABLE III. DISTRIBUTIONS OF MARKS COMPUTED BY THE TEACHERS’ METHODS,
THE 1.6 Q-Md METHOD, AND THE 1.6 Q-CMd METHOD ON THE BASIS OF
SCORES MADE ON THE SAME SERIES OF TESTS BY EACH OF TWO CLASSES
TAUGHT BY DIFFERENT TEACHERS, AND THE NUMBER OF MARKS
CHANGED BY REPLACING THE TEACHERS’ METHODS BY EACH
OF THE OTHER METHODS

Class 1
Average intelligence 52     Median test score 199.77     Q 9.33

                  Teacher’s method     1.6 Q-Md method    1.6 Q-CMd method
Marks             Number  Per cent    Number  Per cent    Number  Per cent

A                      3      3.70         4      4.94         5      6.17
B                     19     23.46        27     33.33        28     34.57
C                     51     62.96        29     35.80        27     33.33
D                      8      9.88        15     18.52        15     18.51
F                      0      0.00         6      7.41         6      7.41
Total                 81    100.00        81    100.00        81     99.99

Marks raised                              13     16.05        15     18.52
Marks unchanged                           48     59.26        48     59.26
Marks lowered                             20     24.69        18     22.22
Total                                     81    100.00        81    100.00

Class 2
Average intelligence 49     Median test score 190.68     Q 8.85

                  Teacher’s method     1.6 Q-Md method    1.6 Q-CMd method
Marks             Number  Per cent    Number  Per cent    Number  Per cent

A                      1      1.30         2      2.60         2      2.60
B                     21     27.27        18     23.38        18     23.38
C                     43     55.84        33     42.86        33     42.86
D                     11     14.29        19     24.68        19     24.68
F                      1      1.30         5      6.49         5      6.49
Total                 77    100.00        77    100.00        77    100.01

Marks raised                               5      6.49         5      6.49
Marks unchanged                           51     66.23        51     66.23
Marks lowered                             21     27.27        21     27.27
Total                                     77     99.99        77     99.99


Perhaps the best way to compare the distributions obtained by
each of the three methods is to determine how nearly they conform to 
the distributions which were adopted as standard. Because the 
average intelligence of each of these two classes differed only very 
slightly from fifty, there is practically no difference between the 
distributions for the 1.6 Q-Md and the 1.6 Q-CMd methods. There- 
fore, the teachers’ distributions will be compared only with those 
obtained by the use of the 1.6 Q-Md method. It will be remembered 
that the standard distribution for the 1.6 Q-Md method is: 5.5, 24, 
41, 24, 5.5. Each of these two teachers accepted 7, 24, 38, 24, 7 as 
their standard. To determine how nearly the actual distributions 
agree with the standard distributions, the sum of the differences 
between the percentages for each mark was computed. Signs were
disregarded in finding this sum. The sum of these differences is
forty-nine for class one, and forty-two for class two. Corresponding
sums for the 1.6 Q-Md method are twenty-three and six,
respectively. While the distribution for class one should be skewed
somewhat, the teacher undoubtedly disturbed the balance between
low and high grades far too much; at least this lack of balance is far
greater for the teacher’s distribution than for the other distributions.
The teacher’s distribution for class two also has too large a number 
of high grades, because the distribution for the 1.6 Q-Md method is 
practically normal and the average intelligence of the class is slightly 
below average. Another fault of the teachers’ distributions is that the 
percentage of C’s for class one is 63 and for class two it is 56. 

In class one forty-one per cent of the grades would be changed, 
and in class two thirty-four per cent of the grades would be changed, 
if the teacher’s method were replaced by the 1.6 Q-Md method to 
translate scores into grades. 

I shall give the distributions for two more classes. These classes 
were taught by different teachers in different courses. The average 
intelligence test score of one of these classes is fifty-five and of the 
other it is only twenty-eight. The changes which would result in the 
teachers’ marks by the use of each of the other methods are also given. 


TABLE IV. DISTRIBUTIONS OF MARKS COMPUTED BY THE TEACHERS’ METHODS,
THE 1.6 Q-Md METHOD, AND THE 1.6 Q-CMd METHOD ON THE BASIS OF
SCORES MADE BY EACH OF TWO DIFFERENT CLASSES TAUGHT BY
DIFFERENT TEACHERS, AND THE NUMBER OF MARKS CHANGED BY
REPLACING THE TEACHERS’ METHODS BY EACH OF THE OTHER
METHODS

Class 1

                  Teacher’s method     1.6 Q-Md method    1.6 Q-CMd method
Marks             Number  Per cent    Number  Per cent    Number  Per cent

A                      0      0.00         1      2.56         2      5.13
B                      5     12.82         9     23.08        11     28.21
C                     27     69.23        17     43.59        16     41.02
D                      5     12.82         9     23.08         7     17.95
F                      2      5.13         3      7.69         3      7.69
Total                 39    100.00        39    100.00        39    100.00

Marks raised                               7     17.95        11     28.21
Marks unchanged                           27     69.23        24     61.54
Marks lowered                              5     12.82         4     10.26
Total                                     39    100.00        39    100.00

Class 2

                  Teacher’s method     1.6 Q-Md method    1.6 Q-CMd method
Marks             Number  Per cent    Number  Per cent    Number  Per cent

A                      5      9.63         3      5.77         0      0.00
B                     14     26.92        14     26.92        11     21.15
C                     19     36.54        24     46.15        13     25.00
D                     12     23.08         9     17.31        21     40.39
F                      2      3.85         2      3.85         7     13.46
Total                 52    100.00        52    100.00        52    100.00

Marks raised                               4      7.69         0      0.00
Marks unchanged                           41     78.85        20     38.46
Marks lowered                              7     13.46        32     61.54
Total                                     52    100.00        52    100.00


The average intelligence test score for class one is fifty-five and for
class two it is only 27.64. In spite of the fact that class one ranks very
much higher in intelligence only thirteen per cent of its marks are
above C, whereas thirty-seven per cent of the marks of class two are
above C, a difference of twenty-four per cent. A similar difference
between the marks below C for these two classes amounts to only
nine per cent. Class one has thirty-three per cent more C’s than class
two.

When the 1.6 Q-Md method is used the distribution of the marks 
of class one conforms almost perfectly to the standard distribution
for this method. When the 1.6 Q-CMd method is employed the
distribution has many more high than low marks, as it should have in 
recognition of its high average intelligence. The application of the 
1.6 Q-Md method to the scores of class two yields too many high in 
comparison with the low marks. When the scores of this class are 
translated into marks by the use of the 1.6 Q-CMd method, there 
are more than twice as many low as high grades. This at least con- 
forms more nearly to what we should expect than any of the other 
distributions. 

When the teacher’s method of translating scores into grades is 
replaced by one of the other methods, the change in grades for class 
one amounts to about forty per cent. Replacing the teacher’s method
for class two by the 1.6 Q-Md method changes twenty-one per cent
of the grades; replacing it by the 1.6 Q-CMd method changes sixty-two
per cent of the grades.

From the results which have just been given on the distributions of 
individual classes, the inferences appear to be justified that students in 
poor classes tend to be graded too high and that students enrolled in 
good classes tend to be graded too low. If a group of students should 
be enrolled in only good classes during one quarter and in only 
poor classes during another quarter, the relative standing of these 
students would probably not be markedly disturbed. But if, as is 
more likely, some of our students should be in poor classes during one 
quarter and in good classes during another, and some of the students 
should be in poor classes during both quarters the relative standing of 
these students in the first quarter would depart widely from their 
relative standing in the second quarter. This change in relative stand- 
ing would be measurable by means of coefficients of correlation. Now 
the 1.6 Q-CMd method of translating scores into grades is designed to 
avoid this change in relative standing on account of differences in 
enrollment in classes of varying abilities. Whether it does so more 
than the other two methods was determined by computing coefficients 
of correlation between the average grades of the winter quarter and 
those of the spring quarter. Three sets of average grades were avail- 
able for each quarter—those computed by each of the three methods 
used to convert scores into grades. 

In order to compute these coefficients not all of our grades could 
be used. In the first place all sophomore grades were eliminated
because relatively few sophomores were enrolled in one or more of our
classes for each quarter. In the second place many freshmen had to be
eliminated for the same reason. As already explained the grades of
four hundred seven different freshmen could be used for our computa- 
tions. For the winter quarter 1008 grades and for the spring quarter 
nine hundred eighty-eight grades were available for these freshmen. 
This represents an average of about 2.5 grades for each student per 
quarter. The number of grades available for individual students 
varied from one to four for each quarter. The grades of each student 
were converted into numerical values by the point-hour ratio method. 
This conversion was made for the grades translated by each of the three 
methods and for each of the two quarters. The coefficients were 
computed by means of the product moment method. 
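
The computation described in this paragraph may be sketched as follows. Several details are assumptions made for illustration and are not given in the text: the point values attached to the letters, the treatment of all courses as carrying equal credit hours, and the classical probable-error formula for r, 0.6745(1 - r^2)/sqrt(N), which is not stated in this article but appears consistent with the probable errors it reports. The grades in the example are hypothetical.

```python
from math import sqrt

POINTS = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}  # assumed point values


def average_points(grades):
    """Mean grade points for one student in one quarter (equal credit hours assumed)."""
    return sum(POINTS[g] for g in grades) / len(grades)


def product_moment_r(x, y):
    """Pearson product-moment coefficient of correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def probable_error_of_r(r, n):
    """Classical probable error of r: 0.6745 (1 - r^2) / sqrt(n)."""
    return 0.6745 * (1 - r * r) / sqrt(n)


# Hypothetical grades for four students in each of two quarters.
winter = [average_points(g) for g in (["A", "B"], ["C", "C", "D"], ["B"], ["D", "F"])]
spring = [average_points(g) for g in (["B", "B"], ["C", "D"], ["A", "C"], ["D"])]
print(round(product_moment_r(winter, spring), 3))

# A coefficient of .493 on 407 students, as in this investigation.
print(round(probable_error_of_r(0.493, 407), 3))
```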

Coefficients of correlation were computed between the winter and 
the spring quarter grades for the teachers’ methods of determining 
them, for the 1.6 Q-Md method, and for the 1.6 Q-CMd method. The 
coefficient of correlation for the teachers’ grades is .493 ± .025 PE.
When all of the grades for each student are available, this coefficient
as determined for other classes is about .60. The coefficient for the
1.6 Q-Md grades is .489 ± .025 PE, and for the 1.6 Q-CMd grades it is
.609 ± .022 PE. The teachers’ grades are just as consistent from
quarter to quarter as are the 1.6 Q-Md grades, but there is a fairly large
and reliable difference between the coefficients of reliability for the
teachers’ grades and the 1.6 Q-CMd grades. The difference between
these two coefficients is .116 ± .026 PE. This difference is more than
4.08 times the probable error of the difference. We may, therefore, be 
practically certain that the true difference between these coefficients 
lies above zero. The probable error of the difference between these 
two coefficients was computed by means of the following formula: 


$$PE_{r_1 - r_2} = .6745\sqrt{SE_{r_1}^{2} + SE_{r_2}^{2} - 2\, r_{r_1 r_2}\, SE_{r_1}\, SE_{r_2}}$$
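
The formula for the probable error of a difference between two correlated coefficients can be tried numerically. The sketch below is illustrative only: the function name is hypothetical, and the correlation between the two coefficients (`r_between`) is not reported in this article. The value 0.4 is a trial value that happens to give a result close to the reported .026; it is an inference and not a figure from the study.

```python
from math import sqrt


def pe_of_difference(pe_1, pe_2, r_between):
    """Probable error of the difference between two correlated coefficients."""
    se_1, se_2 = pe_1 / 0.6745, pe_2 / 0.6745  # standard errors from probable errors
    variance = se_1 ** 2 + se_2 ** 2 - 2 * r_between * se_1 * se_2
    return 0.6745 * sqrt(variance)


# Reported probable errors of the two coefficients; r_between is hypothetical.
independent = pe_of_difference(0.025, 0.022, r_between=0.0)
correlated = pe_of_difference(0.025, 0.022, r_between=0.4)
print(round(independent, 3), round(correlated, 3))

# Ratio of the reported difference to its reported probable error.
print(round(0.116 / 0.026, 2))
```

The correction term matters: were the two coefficients treated as independent, the probable error of the difference would be larger and the ratio correspondingly smaller.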





All methods of translating scores into grades should allow for 
differences in class or group ability, because it has been shown that 
the grades for which such an allowance has been made have an appre- 
ciably higher coefficient of reliability than the grades for which such 
an allowance has not been made. However, no claim is made that the 
method used in this investigation for the purpose of making such an 
allowance is the best that might be used for the purpose. 

Coefficients of correlation were also computed for intelligence test 
scores made on the American Council Psychological Examination and 
each set of marks obtained by the three methods employed for the 
purpose of translating scores into marks. These coefficients are given
in Table V. All of them would have been raised if eta in place of “r”
had been computed, because the relations are slightly but reliably of 
the curved line type. 


TABLE V. COEFFICIENTS OF CORRELATION OBTAINED FOR INTELLIGENCE TEST
SCORES AND EACH SET OF GRADES FOUND BY EACH OF THE THREE METHODS
EMPLOYED FOR THE PURPOSE OF TRANSLATING SCORES INTO GRADES

                      Coefficients for the:

                      Winter quarter        Spring quarter        Average marks
                                                                  for both quarters
                      Intelligence    PE    Intelligence    PE    Intelligence    PE

Teachers’ methods         .239       .032       .299       .031       .309       .030
1.6 Q-Md method           .264       .031       .285       .031       .320       .030
1.6 Q-CMd method          .339       .030       .348       .029       .380       .029

For the winter quarter the difference between the coefficient for 
intelligence and the teachers’ grades and the coefficient for intelligence 
and the 1.6 Q-CMd grades is .10. The probable error of this difference 
is .023. As this difference of .10 is somewhat more than 4.08 times the
probable error of the difference we can be practically certain that the 
true difference between these coefficients lies above zero. The dif- 
ference between the coefficient for the teachers’ grades and intelligence 
and the coefficient for the 1.6 Q-Md grades and intelligence is .025, but 
this is not large enough to be reliable. 

The differences between the coefficients obtained for the spring 
quarter data are not large enough to meet the requirements for relia- 
bility. However, the difference between the coefficient for intelligence 
and the 1.6 Q-CMd grades and that for intelligence and the teachers’ 
grades might have been reliable had not over one-half of the teachers 
during the spring quarter used and adhered closely to the 1.6 Q-Md 
method in translating the scores into grades. Coefficients were also 
computed for intelligence and the average of the winter and spring 
quarter marks. The differences between these coefficients also are 
not large enough to be reliable. The largest difference is .071. This is 
the difference between the coefficient for intelligence and the average 
marks obtained by the teachers’ methods and the coefficient for intel- 
ligence and the average marks obtained by the 1.6 Q-CMd method. 
As the probable error of this difference is .022, the difference is not
large enough to enable one to feel certain that the true difference is
greater than zero. The difference of .071 would have been raised to
.080, if etas in place of r’s had been computed. Eta for intelligence
and the teachers’ marks is .352, and for intelligence and the 1.6 Q-CMd
marks it is .432.

In order to show that the teachers’ grades were markedly different 
from those computed by the 1.6 Q-CMd method coefficients were 
computed for these two sets of grades for both the winter and spring 
quarters. These coefficients are .842 ± .010 PE, and .861 ± .008 PE,
respectively. These coefficients are low enough to show that a 
marked difference existed between these sets of grades. 

The effect of replacing the teachers’ methods by either of the
other two methods is shown in Table VI. This table gives the percentage
of change in grades which occurred by changing the method of
translation from the teachers’ to the 1.6 Q-Md method, and from the
teachers’ methods to the 1.6 Q-CMd method. The percentage of
change is shown for both the winter and spring quarter grades of the
four hundred seven freshmen.


TABLE VI. CHANGES IN INDIVIDUAL GRADES PRODUCED BY THE USE OF THE
1.6 Q-Md METHOD AND THE USE OF THE 1.6 Q-CMd METHOD IN PLACE OF
THE TEACHERS’ METHODS OF TRANSLATING SCORES INTO GRADES

                              Winter quarter      Spring quarter
Amount and direction           Marks changed       Marks changed
of change                   Number  Per cent    Number  Per cent

Changes from the Use of the 1.6 Q-Md Method
Raised two marks                 3       0.3         6       0.6
Raised one mark                 69       6.8        38       3.8
Unchanged                      730      72.3       804      81.4
Lowered one mark               194      19.3       130      13.2
Lowered two marks               12       1.2        10       1.0
Total                         1008      99.9       988     100.0
Total changed                  278      27.6       184      18.6

Changes from the Use of the 1.6 Q-CMd Method
Raised two marks                 0       0.0         2       0.2
Raised one mark                 70       6.9        37       3.7
Unchanged                      613      60.8       681      68.9
Lowered one mark               317      31.4       259      26.2
Lowered two marks                8       0.8         9       0.9
Total                         1008      99.9       988      99.9
Total changed                  395      39.1       307      31.0

The percentage of grades changed by using the 1.6 Q-Md method 
instead of the teachers’ methods is 27.6 for the winter quarter and 18.6 
for the spring quarter. The lower percentage for the spring quarter is 
due to the fact that a larger percentage of teachers made use of the 
1.6 Q-Md method and adhered to it more rigidly than was the case 
during the winter quarter. The percentage of grades changed by 
using the 1.6 Q-CMd method instead of the teachers’ methods is 39.1 
for the winter quarter and 31.0 for the spring quarter. When the 
1.6 Q-Md method was used about three times as many grades were 
lowered as raised. This indicates that our teachers grade too high 
even when the degree of intelligence is left out of consideration. When 
the 1.6 Q-CMd method was used about five times as many grades were 
lowered as raised. If the teachers’ methods of translating scores into 
grades were replaced by the 1.6 Q-Md method approximately 30 per 
cent of the students’ grades would be changed, and if they were 
replaced by the 1.6 Q-CMd method approximately 35 per cent of the 
grades would be changed. 
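The arithmetic behind these statements can be checked with a short sketch that uses the counts printed in Table VI. The dictionary layout and the function name are assumptions made for illustration. Percentages recomputed from the raw counts may differ by a tenth from the printed "total changed" figures, which appear to be sums of already rounded entries.

```python
# Counts taken from Table VI: marks raised, lowered, and left unchanged.
TABLE_VI = {
    ("Q-Md", "winter"): {"raised": 3 + 69, "lowered": 194 + 12, "unchanged": 730},
    ("Q-Md", "spring"): {"raised": 6 + 38, "lowered": 130 + 10, "unchanged": 804},
    ("Q-CMd", "winter"): {"raised": 0 + 70, "lowered": 317 + 8, "unchanged": 613},
    ("Q-CMd", "spring"): {"raised": 2 + 37, "lowered": 259 + 9, "unchanged": 681},
}


def summarize(counts: dict) -> dict:
    """Return the total, the per cent changed, and lowered marks per raised mark."""
    total = sum(counts.values())
    changed = counts["raised"] + counts["lowered"]
    return {
        "total": total,
        "pct_changed": round(100.0 * changed / total, 1),
        "lowered_per_raised": round(counts["lowered"] / counts["raised"], 1),
    }


for key, counts in TABLE_VI.items():
    print(key, summarize(counts))
```

The ratio of lowered to raised marks is what supports the remark that the teachers graded too high.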


SUMMARY 


The purpose of this investigation was to make a comparison of 
three different procedures in the translation of scores into grades. One 
of these procedures consisted of a combination of teachers’ methods; 
another of the 1.6 Q-Md method; and the third of the 1.6 Q-CMd 
method. 

In order to obtain data for the investigation, the teachers of all 
classes in which freshmen were enrolled were requested to furnish for 
two successive quarters the grades for these classes and the scores on 
which their grades were based. The number of grades received was 
1517 for the winter quarter and 1704 for the spring quarter. 

Three sets of grades were available for each of the two sets of scores, 
those prepared by the teachers, those computed by the 1.6 Q-Md 
method, and those computed by the 1.6 Q-CMd method. From a 
comparison of the distributions of these grades it appeared that the 
teachers’ grades were the least satisfactory in three respects, that of 
consistency between the distributions of the two quarters, that of 








256 The Journal of Educational Psychology 


symmetry, and that of agreement with distributions adopted as 
standards. 

A comparison of the distributions prepared for individual classes 
by each of the three methods yielded evidence to the effect that the 
teachers’ distributions were the least satisfactory and the distributions 
for the 1.6 Q-CMd method the most satisfactory. 

The three methods were also compared on the basis of coefficients 
of reliability computed for each method on the basis of winter and 
spring quarter grades. There was no difference between the relia- 
bility coefficients for the teachers’ grades and those obtained by the 
1.6 Q-Md method, but a significant and statistically reliable difference 
was found for the coefficients of the teachers’ grades and those com- 
puted by the 1.6 Q-CMd method. 

Coefficients of correlation were computed for each set of grades for 
each quarter and intelligence test scores. The coefficients for the 
teachers’ grades tended to be the lowest, and those for the 1.6 Q-CMd 
method the highest. The only coefficients which yielded a statistically 
reliable difference were those for the teachers’ grades and the 1.6 
Q-CMd grades for the winter quarter. 

If the scores are translated into grades by the use of the 1.6 Q-Md 
method in place of the teachers’ methods approximately thirty per 
cent of the grades will be changed, provided allowance is made for the 
fact that many of the teachers employed the 1.6 Q-Md method in 
translating their scores for the spring quarter. A replacement of the 
teachers’ methods by the 1.6 Q-CMd method will effect a change of 
thirty-five per cent of the grades. Because so many grades are 
changed when this method is substituted for the teachers’ methods, 
and because it has been shown to be superior to the teachers’ methods 
it should be employed in preference to them. 


THE RELATION OF LEARNING TO SPELLING ABILITY 


EDMUND G. WILLIAMSON 
University of Minnesota 


Several investigations of the influence of psychological factors upon 
ability to spell English words have been published since 1900.* These 
studies show that there are at least three important factors involved in 
spelling ability: ability to perceive the essential features of “word-form,” 
knowledge of the meaning of specific words, and general 
intelligence. These factors, however, are not significantly enough 
related to spelling ability to explain all cases of disability. The present 
investigation, therefore, was made on the assumption that at least a 
partial analysis of spelling disability might be discovered in a measure- 
ment of ability to learn to spell. Probably learning to spell should 
not be considered as an ability independent of the others mentioned 
above, but rather the integration of visual and auditory perception, 
word meaning and general intelligence. An experimental situation 
was set up in which subjects learned to spell sixty Esperanto words 
with which they were unfamiliar.† The advantage of such a procedure 
is that all subjects would begin the experiment approximately equal 
in knowledge of the words. We may expect all subjects to show 
progress in learning to spell, unless there are differences in capacity 
to learn. 

One hundred high school senior boys and girls representing various 
levels of spelling ability, as determined by the spelling test described 
below, were given six daily practice periods of eight minutes each to 
learn the spelling of these Esperanto words. Since about half of the 
records could not be used because of absences for one or more days, 
this study is based upon data for fifty-three students. Following 
each study period, the words were dictated to the subjects, corrected 
by the experimenter, and handed back the next day with a copy of 
the correct spelling. Observation of the subjects during the practice 
periods showed that they were writing as well as silently vocalizing the 
words they had misspelled. Many of them asked the experimenter 
again and again to pronounce the words with which they had difficulty. 





* A review of these studies by the writer is to be published in the December 1933 
issue of the Psychological Bulletin. 
† Words varying in length from four to sixteen letters were selected from E. A. 
Millidge’s Esperanto-English Dictionary. 
To motivate the subjects, all daily records were reported to the teacher 
and pupils. 

The average number of Esperanto words spelled correctly increased 
from fifteen the first day to forty-three the sixth, an average gain of 
twenty-eight. Because of the relationship of Esperanto to other 
languages, students did not begin with zero ability to spell these words. 

To meet the lack of a satisfactory instrument for testing the spelling 
ability of high school seniors, a test was constructed from two lists: 
List A, forty words from Van Wagenen’s Scale E, Division 3, which have the 
highest level of difficulty, and List B, fifty words from Roget’s 
Thesaurus which were judged to be extremely difficult for high school 
seniors. In this way the test contained both easy and difficult 
words. When Lists A and B were combined, the results showed neither zero 
nor perfect scores among our subjects, the distribution of 
scores being approximately normal. The range of scores for the fifty-three 
subjects on the first trial was from fifteen to seventy-four. The 
corresponding range for the second trial was from fourteen to eighty. 
The ninety words were next used to make a multiple choice vocabulary 
test, which made possible the correlation of knowledge of meanings 
with accuracy of spelling. 

Each subject’s percentile rank was available for the Minnesota 
College Aptitude Test consisting of four hundred eighty vocabulary 
items. This enables us to relate spelling ability to general vocabulary 
ability. 

Spelling List A has an odd-even items reliability of .84 for the first 
trial and .79 two weeks later for the second trial as is shown in Table I. 
The test-retest reliability for List A is .56. One explanation for these 
low coefficients is that the words of List A were so easy that most 
subjects received high scores, thereby reducing the variability of the 
group. List B, on the other hand, has an odd-even items reliability 
of .89 for the first trial and .92 for the second. Its retest reliability 
is +.91. List A correlates +.55 with List B for the first trial and only 
+.23 for the second trial (not given in Table I). But A and B combined 
for the first trial correlate +.96 with A and B combined for the 
second trial. The average score on A and B combined is 38.8 ± 11.7 
for the first trial and 42.2 ± 9.1 for the second trial, an average gain 
of only 3.4 words. The average of two trials on A and B combined is, 
therefore, a better index of spelling ability than either list alone, 
although the average of two trials on B would have been satisfactory. 
List A is not sufficiently difficult to differentiate individual levels of 
ability for high school seniors and therefore contributes less than does 
B to our measurement of differences in spelling ability. 
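The retest coefficients quoted above are corrected for the unreliability of the two trials. A minimal sketch of that correction, assuming the usual correction-for-attenuation formula (the raw retest coefficient divided by the square root of the product of the two reliabilities) and using the List A and combined-list figures from Table I; the function name is hypothetical:

```python
import math


def correct_for_attenuation(r_xy: float, rel_x: float, rel_y: float) -> float:
    """Estimate the correlation between two measures freed of their unreliability."""
    return r_xy / math.sqrt(rel_x * rel_y)


# List A: raw retest .46, odd-even reliabilities .84 (first trial) and .79 (second).
list_a = correct_for_attenuation(0.46, 0.84, 0.79)
# Lists A and B combined: raw retest .93, reliabilities .99 and .94.
list_ab = correct_for_attenuation(0.93, 0.99, 0.94)
print(round(list_a, 2), round(list_ab, 2))
```

Both results agree with the corrected coefficients reported in the text (.56 and .96).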

List C, composed of sixty Esperanto words, has a very high relia- 
bility for all six trials. The odd-even items reliability varies from .91 
for the first trial to .95 for the sixth trial. Furthermore, the inter- 
correlations (unattenuated) of the six trials are all high, given separately 
in Table II, varying from .82 for the first vs. sixth trials to .95 for 
adjacent trials (e.g. Trials 1 and 2). If we take the average of six 
trials on List C we have a measurement of learning-to-spell ability with 
a reliability of .98 (Table I). As a matter of fact each of these 
six trials individually correlated with the spelling test (average of two 


TABLE I.—RELIABILITY COEFFICIENTS OF THE VARIOUS TESTS FOR FIFTY-THREE 
HIGH SCHOOL SENIORS 

Columns: (1) odd vs. even; (2) odd-even*; (3) first vs. second trial; 
(4) first vs. second trial, unattenuated† 

Test                                            (1)    (2)    (3)    (4)

Spelling List A (first trial)...............    .72    .84    .46    .56
Spelling List B (first trial)...............    .81    .89    .84    .91
Spelling List A (second trial)..............    .66    .79
Spelling List B (second trial)..............    .86    .92
List AB (combined first trial)..............    .98    .99    .93    .96
List AB (combined second trial).............    .90    .94
List C, trial 1.............................    .85    .92
List C, trial 2.............................    .90    .95
List C, trial 3.............................    .93    .96
List C, trial 4.............................    .93    .96
List C, trial 5.............................    .95    .97
List C, trial 6.............................    .93    .95
List C (trials 1-3-5 vs. 2-4-6).............    .96    .98
Spelling vocabulary A.......................    .47    .64
Spelling vocabulary B.......................    .53    .76
Spelling vocabulary A and B combined........    .78    .87
College aptitude test, sub-tests AN vs. CM..    .95    .97

















* The Spearman-Brown Prophecy Formula is used to determine the odd-even 
reliability of the whole test. 

† Correcting for the reliability of the first and second trials. Paulson® has 
called this the coefficient of Trait Variability in the case of steadiness tests. 
trials on A and B combined) as high as did the average of the six 
trials, namely +.88. 

The spelling vocabulary test has an odd-even reliability of .87. 
Part A, composed of words in spelling List A, has a reliability of .64 
and B a reliability of .76. Since the reliability of the whole test is 
greater than that of either part it is likely the increase in number of 
items raised this coefficient. 
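The effect of test length on reliability mentioned here is what the Spearman-Brown prophecy formula describes. A minimal sketch, using half-test coefficients from Table I; the function name and the `factor` parameter are illustrative assumptions:

```python
def spearman_brown(r_half: float, factor: float = 2.0) -> float:
    """Predict the reliability of a test lengthened `factor` times."""
    return factor * r_half / (1.0 + (factor - 1.0) * r_half)


# Odd vs. even coefficients from Table I, stepped up to whole-test reliability.
print(round(spearman_brown(0.72), 2))  # List A, first trial
print(round(spearman_brown(0.47), 2))  # Spelling vocabulary A
print(round(spearman_brown(0.96), 2))  # List C, trials 1-3-5 vs. 2-4-6
```

Because the stepped-up value always exceeds the half-test value for positive coefficients, a longer test built from comparable items is expected to be more reliable than either of its parts.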

The Minnesota College Aptitude Test has a half-versus-half 
reliability of .97 for our fifty-three subjects. It is composed of four 
hundred eighty vocabulary items arranged in four forms, A and C 
being selected from the 10,000 word level of Thorndike’s Teacher’s 
Word Book; M and N are from the 18, 19, and 20,000 word level as 
determined by Thorndike’s word usage count. The reliability of this 
test for five hundred high school seniors® is .95. 


TABLE II.—INTERCORRELATIONS OF ALL TESTED ABILITIES 

Tested ability                         1       2       3       4

1. Spelling test.................     ..     .89     .72     .72
2. Esperanto words...............     ..      ..     .63     .62
3. Spelling vocabulary...........     ..      ..      ..     .81
4. College aptitude test.........     ..      ..      ..      ..
Mean.............................   41.1    31.3    57.9    44.7
S.D..............................   11.4    12.5     9.2    30.6
S.D. / mean......................   0.28    0.40    0.16    0.69




















Spelling vocabulary correlates +.63 with the spelling of Esperanto 
words, and +.72 with the spelling test. Therefore, word knowledge 
seems to have a significant relation to ability to spell. But since 
spelling vocabulary correlates +.81 with intelligence, the coefficient of 
+.63 may be influenced by the common intelligence factor. Observa- 
tion will reveal the fact that knowledge of the meaning of a word 
increases one’s familiarity with it and therefore leads to more rapid 
learning of its spelling. But since this vocabulary test did not involve 
knowing the meaning of the Esperanto words, its relation with their 
spelling probably is due to a common intelligence factor. 

The correlation of +.62 between learning to spell Esperanto words 
and knowledge of the meaning of four hundred eighty vocabulary items 
is slightly lower than the corresponding coefficient between intelligence 
and the spelling of English words and is possibly due to the common 
intelligence factor. 
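One way to examine how much of the +.63 coefficient could be carried by the common intelligence factor is a first-order partial correlation. The sketch below is an illustration only; the article reports no partial coefficients. It uses the three coefficients quoted in the text and treats the college aptitude test as the measure of intelligence, which is an assumption.

```python
import math


def partial_r(r_xy: float, r_xz: float, r_yz: float) -> float:
    """Correlation between x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1.0 - r_xz**2) * (1.0 - r_yz**2))


# x = spelling vocabulary, y = spelling Esperanto words, z = college aptitude test
vocab_esperanto_given_aptitude = partial_r(0.63, 0.81, 0.62)
print(round(vocab_esperanto_given_aptitude, 2))
```

A partial coefficient well below the zero-order coefficient would be consistent with the suggestion that much of the relationship runs through the common factor.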
What have we discovered about learning-to-spell? First, knowledge 
of the meaning of our English words and general vocabulary 
ability are significantly related to ability to learn to spell. Secondly, 
the method used by our subjects in learning to spell is a confirmation 
of the importance of word-form. During each trial the subjects com- 
pared their attempted spelling of each word with the correct spelling. 
Errors were discovered in this way and the correct spelling was memor- 
ized by vocalizing the word silently and at the same time looking at 
its form. This detection of differences in the word form probably is 
the same factor which was tested by Gates! and Sister Mary’s’ percep- 
tion tests. We see, furthermore, that spelling Esperanto words 
correlates +.89 with the spelling of English words. It seems that we 
had in our experimental procedure the essential features for learning 
to spell, namely, proper motivation, detection of the specific errors in 
words misspelled, and drill in the forming of the correct associations 
of letter-sequence with the sound of the word. 


SPELLING ENGLISH WORDS AS THE CRITERION OF SPELLING 
ABILITY 


The only factors significantly related to spelling ability are word 
knowledge and intelligence. Knowledge of the meaning of words 
correlates +.72 with spelling ability, which is as high as the correlation 
with intelligence. We may infer that in most cases those subjects 
who know the meaning of our English words are the ones who do well 
in spelling these same words. The coefficient of +.72 between spelling 
and intelligence is higher than other investigators have found. 

The correlation of +.89 between spelling English words and spelling 
Esperanto words is surprisingly high. Norsworthy‘ reports a correla- 
tion of +.41 between learning and recall of German-English vocabu- 
lary. Gordon? found a relationship of +.71 between learning and 
recall of logical material. Possibly our high coefficient is caused by 
the high reliability of our tests, the wide range of abilities represented 
by our subjects and the careful control of motivation. All the factors 
or tests used in this study correlate to about the same extent with 
spelling English words and with learning to spell Esperanto words. 


“SPECIAL DISABILITY” IN SPELLING 


The results of this experiment provide an opportunity to test 
Hollingworth’s theory* that there are subjects who have a “special 
disability” for spelling. Such individuals require an unusually large 
number of repetitions to form the correct connections between the 
letters of words. Were there any of our subjects who showed little 
if any improvement with these six trials? Do we need this hypothesis 
of “special disability” to explain this failure to improve or can these 
cases be explained otherwise? 


TABLE III.—FOUR SUBJECTS WHO MADE LITTLE PROGRESS IN LEARNING TO SPELL 
ESPERANTO WORDS 

             Trials with Esperanto words    Average of two     Percentile rank
Subject’s                                   trials spelling    in college
number        1    2    3    4    5    6    English words      aptitude test

11            4    8   12   14   18   20          35                 27
22            3    3    7   10    8   13          28                  4
26            5    6    5    3    8   11          15                  2
51            3    6    7   11   14   17          39                  6














There were seven subjects who spelled between three and five words 
correctly on the first trial. Of these seven, three had made a significant 
gain by the sixth trial. Inspection of the tests themselves shows that 
of the remaining four subjects, one had advanced to twenty correct 
spellings by the sixth trial; another to thirteen; another to seventeen; 
and the fourth subject to eleven. Here are four subjects who made 
very little progress in learning to spell these particular words. It 
might be argued that these four subjects did not try to improve, 
that they were not adequately motivated. But as a matter of fact, 
all except number 51 were observed by their teacher to be trying very 
hard to improve. The experimenter noticed these particular students 
and was impressed with their earnestness and effort to learn the correct 
spelling of these words. This was especially true in the case of number 
26, who told his teacher he wanted to go to college and if he did well in 
these tests he thought it would be easier for him to enter college. It 
seems that these four students were unable to improve significantly 
even though they were highly motivated. 

But do we need the hypothesis of “special disability” for spelling 
to explain their failure to learn? An answer to this question is found 
by inspecting the last two columns of Table III, which show the 
spelling score of these students and their percentile rank in the 
College Aptitude Test. The 
low learning performance of these four subjects is in all probability due to 
their poor study habits and low academic or abstract intellect. As 
long as we have this factor of intelligence available as an explanation 
of failure to learn, we do not need to fall back on the vague theory of 
“special disability.” A coefficient of +.72 between spelling ability 
and intelligence means that in general spelling ability and intellectual 
ability are not perfectly related, but this same coefficient does not 
mean, as is sometimes inferred, that only a little of spelling disability 
can be explained in terms of abstract or verbal intelligence. 

This emphasis upon the importance of academic intelligence for 
learning to spell is substantiated by the data in Table IV, in which is 
given the relationship between level of college ability percentile rank 
and level of spelling ability and achievement in learning to spell. 
The total range of ranks or scores in each of these three tests, college 
aptitude, spelling ability, and learning to spell Esperanto words, is 
divided into thirds. We see by inspecting Table IV that no subject 
who was in the lower or middle one-third of the range of scores in the 
college aptitude test was in the upper one-third in spelling ability. 
But we do find that eight of the subjects in the upper third in college 
ability were in the middle third in spelling ability. And we also find 
that seven of those in the middle third in the college ability test were 
in the lower one-third in spelling ability. These data mean that, in 
general, there is a marked degree of correspondence between level of 


TABLE IV.—RELATIONSHIP BETWEEN LEVEL OF COLLEGE APTITUDE TEST AND 
LEVEL OF SPELLING ABILITY AND ACHIEVEMENT IN LEARNING TO SPELL 

                                Spelling ability,           Achievement in learning
                                 range of scores            to spell, range of scores
College aptitude test,       Lower  Middle  Upper           Lower  Middle  Upper
range of percentile ranks    third  third   third   Total   third  third   third   Total

Upper third..............      ..      8       7      15      ..      5      10      15
Middle third.............       7     11      ..      18       2     11       5      18
Lower third..............      10     10      ..      20       6     11       3      20
Total....................      17     29       7      53       8     27      18      53























academic intelligence and level of spelling ability. But we note 
furthermore that ten individuals of low academic intelligence achieved 
a skill in spelling which exceeded their general verbal skill, but none of 
them became “good” spellers. 
On the other hand, the right half of Table IV shows that in our 
experimental learning situation eleven of our subjects in the lower 
third in academic intelligence achieved a performance comparable to 
five of those high in academic intelligence. This excellent performance 
of these low subjects is probably due to intense motivation. We find, 
furthermore, three of these low ability students in the upper one-third 
in learning to spell Esperanto words. Ten of those in the upper 
third in academic intelligence are in the upper third in learning to spell; 
the corresponding number for spelling ability is seven; and five of the 
middle third in college aptitude test are in the upper third in learning 
to spell. In other words, our experimental learning situation brought 
about performance in spelling which excels that achieved in the 
ordinary school situation. The correlation between spelling ability 
and college aptitude test is +.72, the correlation between learning to 
spell Esperanto words and college aptitude test is only +.62. An 
explanation for the difference between these two coefficients is found 
in the intensive motivation of our subjects as is shown by the data in 
Table IV. Most of our subjects showed marked improvement in 
learning to spell but it cannot be concluded that they would become 
excellent spellers with additional drill and labor. We must adopt an 
attitude toward spelling similar to the usual attitude toward the 
acquisition of word knowledge, namely, that we may expect some 
individuals to learn many thousands of English words, some individuals 
to learn a few thousands, and some to learn to spell only the very 
simplest words. Furthermore, the misspelling of a few selected words 
does not indicate general spelling disability as many teachers seem to 
think. 
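The comparison drawn from the two halves of Table IV can be retraced with a short tabulation. The sketch below copies the cell counts from the table, reads the blank cells as zero (an assumption), and uses hypothetical names throughout.

```python
# Rows: third of the college aptitude range; columns: (lower, middle, upper) third.
SPELLING_ABILITY = {"upper": (0, 8, 7), "middle": (7, 11, 0), "lower": (10, 10, 0)}
LEARNING_TO_SPELL = {"upper": (0, 5, 10), "middle": (2, 11, 5), "lower": (6, 11, 3)}


def column_totals(table: dict) -> tuple:
    """Number of subjects in the lower, middle, and upper third of a measure."""
    return tuple(sum(column) for column in zip(*table.values()))


def upper_third_share(table: dict) -> float:
    """Fraction of all subjects who reached the upper third of a measure."""
    lower, middle, upper = column_totals(table)
    return upper / (lower + middle + upper)


print(column_totals(SPELLING_ABILITY), column_totals(LEARNING_TO_SPELL))
print(round(upper_third_share(SPELLING_ABILITY), 2),
      round(upper_third_share(LEARNING_TO_SPELL), 2))
```

The column totals reproduce the bottom row of Table IV, and the larger upper-third share for learning to spell corresponds to the statement that the experimental situation brought about better performance than the ordinary school situation.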

Within the limitations of this experiment we may state that in the 
usual classroom learning conditions an individual is not likely to be a 
“good” speller unless he has “high” intelligence and works hard. 
If he has very low intelligence, however, and is adequately motivated, 
he may improve his skill in spelling. Therefore, even though the 
coefficient of correlation between spelling and intelligence is only +.72 
we have cited evidence that the latter is a very important factor in 
spelling ability. Without intelligence, we cannot expect many 
individuals to achieve a high performance in spelling even with much 
drill and motivation. We find no evidence to support the popular 
belief among teachers that “anyone can learn to spell if he tries hard 
enough.” If the results of this experiment are verified by others, 
then more attention should be directed to skills of learning and to 
general intelligence as important factors in spelling disability, particularly 
in the case of high school and college students. 


THE MEASUREMENT OF THE RELATIVE INTEREST 
VALUE OF REPRESENTATIVE ITEMS TAUGHT IN 
ELEMENTARY PSYCHOLOGY* 


G. W. HARTMANN 


Department of Psychology, Pennsylvania State College 


I. STATEMENT OF PROBLEM 


It has often been alleged by not wholly facetious critics of the profession 
that psychology is one of the most “unpsychologically” taught 
subjects in the curriculum. Instead of being invited by magnificent 
vistas of man’s mind or the fascinating intricacies of behavior, the 
student is repelled by such barren puerilities as nonsense syllables, 
auditory theories, and lifted weights. Such protests come with equal 
vigor from pupils, academic colleagues, and psychologists themselves. 
Attempts to deal with this unfortunate situation through the assign- 
ment of superior instructors to beginning classes or the adoption of 
improved pedagogical methods are only partial remedies as long as the 
actual content remains unaltered. Many psychologists would defend 
the status quo by insisting that the subject can only be appreciated by 
the better students; but as long as our institutions deliver “normal” 
youths to us for training, a social obligation exists to make the course 
personally rewarding to every member of the class. The present paper 
is an attempt to deal with this question by means of a technique for 
selecting the most appropriate topics to be included in the term’s work. 

Curriculum-builders in other fields have devised various methods of 
determining the value of an entire subject of instruction or any of its 
subdivisions. The main criteria of selection or rejection appear to be 
as follows: 

1. Frequency of Recurrence.—Other things being equal, that infor- 
mation is relatively most useful which promises to function most 
frequently in the life of the individual. 

2. Cruciality.—Those items are relatively most serviceable which 
give promise of entering as factors into the most critical adjustments 
which the learner may need to make. 

3. Manysidedness.—That material is relatively of greatest value 
which is most varied and integrating in nature, i.e., which can be 
employed in the largest number of different circumstances. 





* I am indebted to Mr. A. S. Lick of Lebanon, Pa., for assisting in the extensive 
computations which this investigation required. 
4. Personal Appeal.—The most useful datum is the one which 
arouses the strongest subjective interest in the learner. This appears 
to have two aspects: (1) The degree of immediate or temporary 
enthusiasm; (2) the extent to which each unit of experience promises 
to be permanently satisfying and enriching to the individual. 

This last standard was adopted in carrying out the present study. 
While it undoubtedly possesses a number of philosophical and socio- 
logical weaknesses—all the objections which have been levelled against 
hedonism and eudaemonism being pertinent here—practical regard for 
human happiness demands that it be considered. The enjoyment the 
student obtains while learning is part of the aggregate of joys which 
make his life worth while. And with respect to deferred values, has 
it not been maintained that the possibility of perennial enjoyment is 
the surest index of a classic in any field? For instance, a youngster 
learns something about after-images in his psychology course. There- 
after he may notice himself having after-images. To observe these 
and to understand their cause may give him keen pleasure or be only 
a matter of indifference. Obviously one criterion of the value of his 
knowledge about after-images is the extent of the thrill or “kick” 
experienced with each use of the resultant ability.* 

Psychologists certainly should be among the last to dispute this 
claim. We are all familiar with the Thorndikian dictum that con- 
nections grow stronger if they issue in satisfying states of affairs. In 
dealing with the controversial problem concerning the genetic primacy 
of attention versus interest, Titchener remarks,7{:* “I incline to find a 
fairly close parallel between degree of clearness and degree of pleasant- 
ness-unpleasantness, and to regard the relation between affection and 
attention, not as external, but as intrinsic.” One could find ample 
citations among authorities indicating that whatever situation is 
capable of being a source of pleasantness or unpleasantness is also 
likely to become enhanced in vividness. Consequently, one is pre- 
pared for Garretson’s! finding of a high correlation between the 
expressed preferences of ninth-grade boys and achievement in special 
subjects. Especially significant for our purpose are the studies of 
Himes* in the field of biology and Wray’ in chemistry, both of which 
yielded high coefficients between the pleasure experienced in recog- 
nizing a fact and its frequency as determined by the number of 
occasions it was encountered in actual life. Apparently, if we know 


* Pp. 62-68. 
† P. 302. 
the affective value of an item we can predict the extent to which it 
will be employed! 


II. SELECTION OF TEST ITEMS 


Several writers have drawn attention to the fact that the content 
of courses in introductory psychology shows little or no uniformity 
throughout the colleges of the country. This lack of standardization 
made difficult a purely objective construction of the interest test to be 
described but did not forbid an approximation thereto. The following 
five textbooks [see (4) for the justification of the list] were chosen as 
offering typical source material: 


Dashiell, J. F., “Fundamentals of Objective Psychology,” 1928. 
Pillsbury, W. B., “Essentials of Psychology,” 1930. 
Ruckmick, C. A., “The Mental Life,” 1928. 
Woodworth, R. S., “Psychology,” 1929 (revised edition). 
Warren, H. C. and Carmichael, L., “Elements of Human Psychology,” 1930. 


Two hundred fifty-six statements, all true, were assembled, 
including a number taken from the exhaustive standardized exami- 
nation on Woodworth’s psychology, as prepared by the author himself. 
A representative sampling and reasonable cross-section was desired, 
but the only test for proper balance among the elements which could 
be applied was the amount of agreement between the proportion under 
each heading in our list and the proportions given by Haggerty? in 
an extensive topical analysis undertaken a few years ago. With few 
exceptions, these rubrics agree with those adopted by the “Psychological 
Abstracts” in classifying its reviews, and are generally understandable. 
Table I was prepared as a check upon the composition 
of the test in order to see if a proper distribution of emphasis among 
various topics had been made. Column 1 shows the percentages under 
each head in the interest test used in this study, column 2 the ratios 
from Haggerty’s survey (his own monograph gives only the raw 
numbers), and column 3 the proportions covered by the combined 
averages of the five major texts employed. This last comparison is 
the crudest, since it was based not upon a word count but merely upon 
the number of pages devoted to the respective topics. 

The first and second columns, in addition to being the most com- 
plete, are also the most significant. Inspection reveals a noticeable 
likeness of the two which is confirmed by a rank-difference correlation 
of .85, although it is a bit questionable to apply that technique to these 
TABLE I.—COMPARATIVE CLASSIFICATION OF TOPICS ACCORDING TO DISTRIBUTION 
OF ITEMS 

                                           (1)          (2)          (3)
                                         Interest    Haggerty’s   Documentary
                                          test,       survey,     analysis of
                                         per cent    per cent    texts, per cent

 1. General..........................       9.6        3.94          4.8
 2. Sensation and perception.........      14.6       12.13         12.1
 3. Feeling and emotion..............       4.2        4.21          5.5
 4. Attention, memory and thought....      22.9       17.1          10.2
 5. Nervous system...................      29.3       21.2          15.6
 6. Motor phenomena and action.......       8.2        9.12          8.6
 7. Plant and animal behavior........       3.9        3.49          2.0
 8. Nervous and mental diseases......       3.9        7.19           ..
 9. Special mental conditions........       1.1         .92           ..
10. Biometry and statistics..........       3.9        4.25           ..
11. Mental measurements..............       3.2        1.23          4.8
12. Evolution and heredity...........       4.2        2.83          3.4
13. Social functions of the individual      1.1        3.52          3.1
14. Additional topics not classified.       8.2        5.40          2.5














series. The discrepancies which do occur are perhaps due as much 
to uncertainties of classification and alternative or double grouping 
as to any real differences in relative composition. 
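The rank-difference correlation cited above can be sketched in a few lines. The sketch applies the usual formula, rho = 1 - 6*sum(d^2) / [n(n^2 - 1)], to columns 1 and 2 of Table I as printed here. Giving tied values the average of their ranks is an assumption, since the text does not say how ties were handled, and the result need not reproduce the reported .85 exactly.

```python
def average_ranks(values):
    """Rank values from largest (rank 1) to smallest; ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def rank_difference_rho(x, y):
    """Rank-difference (Spearman) coefficient by the classical d-squared formula."""
    n = len(x)
    rx, ry = average_ranks(x), average_ranks(y)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Columns 1 and 2 of Table I, in rubric order, as printed.
interest_test = [9.6, 14.6, 4.2, 22.9, 29.3, 8.2, 3.9, 3.9, 1.1, 3.9, 3.2, 4.2, 1.1, 8.2]
haggerty = [3.94, 12.13, 4.21, 17.1, 21.2, 9.12, 3.49, 7.19, 0.92, 4.25, 1.23, 2.83, 3.52, 5.40]

rho = rank_difference_rho(interest_test, haggerty)
print(round(rho, 2))
```

With many ties in one series, the d-squared formula is only an approximation, which is one concrete reason the technique is "a bit questionable" for data of this kind.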


III. ADMINISTRATION AND STATISTICAL TREATMENT OF THE INTEREST 
TEST 


The next step consisted in printing a twelve-page folder containing 
the two hundred fifty-six surviving items in a free sequential order 
with the following directions and a sample item at the top: 


Below are a number of items which are found in a course in elementary
psychology. You will readily recognize that these items do not all have the same 
value or meaning to you. It is to ascertain these differences that this study is 
being made. 

Will you kindly check the items according to the degree of pleasure (that is, the 
degree of satisfaction or agreeable emotion or feeling) that it gives you to under- 
stand the item when you meet it in your experience. 

If, however, the knowledge of the item gives you a feeling of dissatisfaction,
displeasure or annoyance, indicate the extent of your aversion in the appropriate
column.

If, on the other hand, you are genuinely indifferent or neutral in your feelings
toward the subject matter of the item, in that it causes you neither pleasure nor
displeasure, place a check in the column marked “neutral.”

If there are any items which you have never learned, do NOT attempt to check
them.


CHECK EACH ITEM ACCORDING TO PLEASURE, AVERSION, OR INDIFFERENCE.


Aversion   Pleasure |  1. The field of psychology lies between the
                    |  biological sciences on the one hand, and the social
                    |  sciences on the other.
Responses were secured from over four hundred undergraduates at
The Pennsylvania State College, mostly sophomores but with a heavy 
sprinkling of upperclassmen. All of them had completed the basic 
course in psychology and at the time of the testing (mid-semester) 
were pursuing work in educational or applied psychology, both 
required courses. A full class period was devoted to marking the 
entries with the regular instructor in charge. So far as could be gauged 
from overt attitude and the nature of questions asked, the reactions 
given appeared to register the genuine opinions of the group. 

The scoring of the individual record sheets was accomplished very 
simply by converting the seven divisions of the rating scale into 
numerical equivalents according to the key given below: 


Pleasure
    [label illegible] ............................  3
    Moderate .....................................  2
    Little .......................................  1
Neutral ..........................................  0
Aversion
    [label illegible] ............................ -1
    [label illegible] ............................ -2
    [label illegible] ............................ -3
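A minimal sketch of applying this key follows. The column names are hypothetical stand-ins for the seven divisions of the blank; only the numerical equivalents (+3 to -3) come from the key. Unchecked items are skipped, in line with the direction not to mark statements never learned.

```python
from statistics import mean

# Hypothetical column names; the numerical equivalents follow the scoring key.
KEY = {
    "pleasure_3": 3, "pleasure_2": 2, "pleasure_1": 1,
    "neutral": 0,
    "aversion_1": -1, "aversion_2": -2, "aversion_3": -3,
}


def item_mean(checks):
    """Mean pleasure value of one item; None marks a student who left it unchecked."""
    values = [KEY[c] for c in checks if c is not None]
    return mean(values) if values else None


def fraction_checked(checks):
    """Share of the group that marked the item at all."""
    return sum(c is not None for c in checks) / len(checks)


# Invented checks from five students on a single item, for illustration only.
one_item = ["pleasure_2", "pleasure_1", None, "neutral", "aversion_1"]
print(item_mean(one_item), fraction_checked(one_item))
```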
The mean pleasure value for each item could then be obtained by
averaging the columnar values placed opposite each item by N indi- 
viduals. Since a complete tabulation for the entire group did not seem 
profitable, a random sample of two hundred papers was removed from 
the original pile by selecting every second paper from the accidentally 
arranged pack. The mean pleasure values of each of the two hundred 
fifty-six items as determined from the first hundred when matched 
against the similar values for the second hundred yielded an uncor- 
rected Pearson r of .94 + .007, an unusually gratifying result. The 
reliability of the test itself was determined by taking another random 
half or fifty of the first hundred papers referred to above and correlating
the grand mean pleasure values (i.e., the average of 128 items
or one-half of 256) of the odd items for each subject with the corre- 
sponding means for the even items. This method gave a raw coeffi- 
cient of .90 + .018. These constants are high enough to justify 
working with a relatively small number of cases, but in order to err on 
the safe side, the ranking of all the items was based upon the mean 
pleasure indices for the entire first sampling of two hundred papers. 
This hierarchy is given in the Appendix. The highest average interest
is attached to the statement, “The human eye is a registering optical
instrument like the camera” with an index of 2.03, equivalent to a
vote of a little more than moderate pleasure. The lowest pleasure
index fell to “Aboulia is an abnormal degree of lack of zest for action”
with a value of .25, thereby approaching very closely indifference or
zero. It is pleasing to find that no one item dropped below the neutral
line so far as its mean was concerned, although plenty of negative
checks appeared on the “aversion” side in individual records. Table
II shows how the customary textbook and periodical headings compare
with respect to reported student interest.


TABLE II. RANK ORDER OF ITEMS FOR PLEASURE VALUE WHEN COMBINED AND
CLASSIFIED ACCORDING TO THE USUAL PSYCHOLOGICAL GROUPS

RUBRIC                                            MEAN PLEASURE VALUE

 1. Social functions of the individual ..........        1.51
 2. Attention, memory and thought ...............        1.35
 3. [rubric illegible] ..........................        1.33
 4. [rubric illegible] ..........................        1.32
 5. [rubric illegible] ..........................        1.25
 6. [rubric illegible] ..........................        1.22
 7. Motor phenomena and action ..................        1.18
 8. [rubric illegible] ..........................        1.15
 9. Sensation and perception ....................        1.13
10. [rubric illegible] ..........................        1.12
11. Special mental conditions ...................        1.11
12. Nervous and mental diseases .................         .86
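The odd-even reliability check described above can be sketched as follows. Each subject's mean on the odd-numbered items is correlated with his mean on the even-numbered items. The four short records are invented for illustration. The `spearman_brown` helper is an addition not used in the text, which reports only the raw coefficient; it shows the conventional step-up to full test length.

```python
import math
from statistics import mean


def pearson_r(x, y):
    """Product-moment correlation of two equal-length series."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def odd_even_reliability(records):
    """Correlate per-subject means on odd items with means on even items."""
    odd_means, even_means = [], []
    for scores in records:
        numbered = list(enumerate(scores, start=1))  # item numbers start at 1
        odd = [s for i, s in numbered if i % 2 == 1 and s is not None]
        even = [s for i, s in numbered if i % 2 == 0 and s is not None]
        odd_means.append(mean(odd))
        even_means.append(mean(even))
    return pearson_r(odd_means, even_means)


def spearman_brown(r_half):
    """Estimated full-length reliability from a half-test correlation."""
    return 2 * r_half / (1 + r_half)


# Invented records: one list of item scores per subject, None for an unchecked item.
records = [
    [3, 2, 2, 3, 1, 2],
    [1, 0, 1, 1, 0, 0],
    [2, 1, None, 2, 1, 1],
    [0, -1, 0, 0, -1, -1],
]
r = odd_even_reliability(records)
print(round(r, 2), round(spearman_brown(0.90), 3))
```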

If one may generalize from this list to psychological content as a 
whole it would appear that for the majority of students psychological 
information lies between “little” and “moderate” pleasure and is
somewhat nearer the former than the latter. 


IV. RELATION OF COMPOSITE PLEASURE VALUE TO INTELLIGENCE, AGE, 
SEX, AND ACHIEVEMENT IN PSYCHOLOGY 


In this section, composite pleasure value simply means the grand 
average pleasure rating of the total two hundred fifty-six items for 
each subject. This variable was matched with other presumably 
significant and available data in the hope of discovering some of the 
determinants of positive interest. 

The DeCamp intelligence test scores were obtained for one hundred 
two students who had participated in the investigation. These 
correlated .016 + .07 with the corresponding pleasure indices, indi- 
cating that little or no reliance can be placed upon variations in native 
ability as a symptom of preferences in psychology. Moreover, this 
conclusion receives further support from a similar correlation of 
.0059 + .07 with grades received in elementary psychology. This 
finding is very definitely opposed to the view, held by a number of 
“hard-boiled” instructors, that the good students are the only ones 
who are apt to be interested. 
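The "±" figures attached to these coefficients are consistent with the probable error of r, PE = 0.6745(1 - r^2)/sqrt(N), which was the usual convention of the period. The text does not state the formula, so treating the figures this way is an assumption. A minimal sketch:

```python
import math


def probable_error_r(r, n):
    """Probable error of a correlation coefficient (assumed formula, see above)."""
    return 0.6745 * (1 - r * r) / math.sqrt(n)


# Intelligence versus composite pleasure value: r = .016 on 102 students.
print(round(probable_error_r(0.016, 102), 2))

# Odd-even reliability: r = .90 on fifty papers.
print(round(probable_error_r(0.90, 50), 3))
```

A coefficient smaller than its own probable error, as with .016 against .07, gives no ground for asserting any relation at all.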

The correlation with the ages of the one hundred one students was 
computed and found to be .25 + .06, which one may interpret as show- 
ing a slight tendency for older students to appreciate more the personal 
meaning of many psychological facts. If so, this constitutes a mild 
argument against offering psychology to freshmen or senior high-school 
pupils since maturity and experience appear to be factors of advantage 
in this particular subject. Perhaps a certain level of objectivity is 
required, which a person too near the high tide of adolescence does not
possess.

Sex differences in abilities and interests always constitute material
for educational disputes. Our records on this point reveal the following
data:


                     N     Mean composite      SD      SD of average
                           pleasure values

Men ..............   50         1.15           .54         .076
Women ............   50         1.25           .51         .072


Using the diff./SD_diff. technique we find a critical ratio of .95 which
gives about eighty-three chances in one hundred that the female index
is really higher than the male, a fact which seems to be in accord with
common belief.
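The critical ratio above can be retraced from the tabled figures. The sketch assumes that the SD of each average is SD/sqrt(N), that the SD of the difference is the root of the sum of the two squared SDs of the averages, and that the "chances in one hundred" are read from the normal curve on one side. These are the customary steps, though the text does not spell them out.

```python
import math


def sd_of_average(sd, n):
    """Standard deviation of a mean."""
    return sd / math.sqrt(n)


def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two means divided by the SD of that difference."""
    sd_diff = math.sqrt(sd_of_average(sd_a, n_a) ** 2 + sd_of_average(sd_b, n_b) ** 2)
    return (mean_b - mean_a) / sd_diff


def chances_in_100(cr):
    """One-sided normal-curve chances that the true difference exceeds zero."""
    return 100 * 0.5 * (1 + math.erf(cr / math.sqrt(2)))


# Men: N = 50, mean 1.15, SD .54; women: N = 50, mean 1.25, SD .51.
cr = critical_ratio(1.15, 0.54, 50, 1.25, 0.51, 50)
print(round(cr, 2), round(chances_in_100(cr)))
```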

In an effort to unearth any possible external causes for the differ- 
ences in pleasure value reported, it was thought that the printed length 
of the various items might affect the apparent pleasure attributed to 
them. Granting equivalence of content, it is well known that long 
awkward sentences provoke an unpleasant feeling-tone. The average 
number of words per statement in the test blank was 12.20, making 
possible the following comparison: 





                                    N     Mean pleasure     SD     SD of average
                                          value

Items of 12 words and under ....   128        1.25         .316        .028
Items of 13 words and over .....   128        1.17         .301        .027




















Using the same method involved in the preceding paragraph, one 
obtains a critical ratio of 2.05 or about ninety-eight chances in one 
hundred that a true difference is present, making it rather probable 
that the length of the statements here employed did noticeably alter 
the expression of interest. Inspection indicated that the more technical
or less “human” items tended to be longer, a feature which
suggests that the interesting things are usually the clear and simple 
ones. Most of us, unless especially trained, usually experience a cer- 
tain amount of discomfort in the presence of intellectual complexity. 
We can not like what we do not fully understand. 


V. CONCLUSIONS


Apparently it is possible by means of the mass rating scale here 
described to determine variations in interest value of typical items 
in any course with a high degree of reliability. However, what factors 
other than intrinsic likes or dislikes are responsible for these differences 
are as yet unknown, except that age appears to raise the general level 
of pleasure derived from meeting these statements. Phraseology, too, 
plays a part. Nevertheless, it seems evident that facts, identical as 
to objective truth, may vary widely in the pleasurable recognition 
value they evoke. For instance, it is quite possible that some items
were ranked low because they clashed with earlier and stronger prejudices
and ideals, thus leading to a negative attitude and an unwillingness
to accept with good grace an “unpleasant fact.” Another
interpretation, based upon the mechanism of conditioning, would 
maintain that the circumstances under which these items were learned 
are responsible for the pleasure or aversion and that the facts themselves 
are absolutely neutral. However, this seems unlikely because of the 
high correlation between large random samples of students pursuing 
different professional goals, reading different texts, and listening to 
instructors with distinct personalities. 

The practical use to be made of the ranking presented in the Appen- 
dix is obvious, once we assume the validity of the arrangement, which 
should be checked at other institutions. If we wish to hold the atten- 
tion of students throughout the term’s work it would seem desirable 
to begin with those topics which offer the highest average pleasure 
value at the start, in the hope that this favorable initial impression will 
be maintained. An interesting confirmation of the correctness of the 
order of topics presented in Table II may be found in the difference 
between Woodworth’s revised edition of his “Psychology” and the
first edition; the revision follows the sequence of this table much more 
closely than the earlier issue. Similarly, Wheeler’s development in his 
recent “Psychology” with its theoretically grounded progression from 
the broader to the narrower determinants of behavior appears to be 
justified on another basis by the interest hierarchy we have found. 
Perhaps the logical and the psychological approaches are not so far 
apart after all! 
APPENDIX


The two hundred forty remaining items* of the interest test are arranged in 
descending order according to average pleasure value. The exact values are given
only after the first ten and the last ten statements since the intervening values 
decrease regularly by about .01 with each step. 

1. The human eye is a registering optical instrument like the camera. (2.03) 
2. A person’s success is dependent upon his capacity to make adequate 
adjustments. (1.97) 
3. To break a habit manage somehow to substitute some other response for 
the one to be avoided. (1.95) 
4. Personality is the entire mental organization of a human being at any 
state of his development. (1.95) 
5. Play is characteristic of all young animals and babies. (1.94) 
6. Character is the sum total of a person’s habits. (1.94) 
7. What we call intelligence may prove to be a complex of several factors. 
(1.93) 
8. The individual’s physique is a factor in his personality. (1.92) 
9. The technique of administering a test is as important as the test itself. 
(1.87) 
10. Confidence aids wonderfully in recall. (1.87) 
11. The popular notion that quick learners forget easiest has been disproven. 
12. Change is the greatest factor in attracting attention. 
13. No single performance of an individual could be used as a fair indicator 
of intelligence. 
14. A review is the best way to avoid forgetting. 
15. An active attitude on the part of the learner facilitates learning. 
16. A moving object is more likely to be noticed than a still one. 
17. Fear is the emotional response to danger. 
18. Reasoning is the attempt of the mind to organize its experiences. 
19. Memory is a reinstatement of an old experience. 
20. Rhythm is a great aid in learning. 
21. The real seeing part of the eye is the retina. 
22. The younger the child the greater the degree of plasticity. 
23. Recitation has an immense value and significance in the process of 
memorizing. 
24. The nervous system resembles the telephone system of a city. 
25. Laboratory procedure is becoming more and more dominant in modern 
psychological experiments. 
26. When one person observes another person’s behavior the observation is 
called objective. 
27. To judge from the verbs he uses in talking, the young child’s concepts 
grow out of activities in which he participates. 





*In accordance with the test instructions to omit any unknown statement, 
sixteen items from the original list have been eliminated because they were checked 
by less than one-half of the group. 
28. Learning is easier if the repetitions be distributed over several days rather
than accumulated on a single day. 

29. A stimulus is whatever arouses the individual to activity. 

30. Habits are essential to action. 

31. Experimental psychology is becoming more and more objective. 

32. The intelligence quotient is derived by dividing the mental age by the 
chronological age. 

33. The mental age is an inadequate measure of a person’s ability.

34. By pitch is meant the highness or lowness of a tone. 

35. Laughter is an instinctive response. 

36. Psychology does not deny the existence of a soul, but it deals only with 
the mind as it occurs during the lifetime of the body. 

37. Fear or rage are antagonistic to the healthy body-developing processes 
of the organism. 

38. Enjoyment of art requires intellectual as well as emotional activity on the 
enjoyer’s part. 

39. There is a positive correlation between the intelligence of parents and 
children. 

40. Concepts aid greatly in the acquiring of knowledge. 

41. Seeing and hearing are counted as activities in psychology. 

42. The individual’s limitations are factors in personality. 

43. When one thing reminds you of another, it is because some connection 
between the two has previously been noticed. 

44. Differences in mental capacity are very largely inherited. 

45. The degree of learning is directly proportional to the number of repetitions. 

46. Laughter is an unlearned activity. 

47. Hunger is one of the most powerful stimulants to body activity and 
performance. 

48. To judge from puzzle experiments, human beings combine observation 
with trial and error. 

49. An impression means that a sensation has been aroused. 

50. Advertising is a form of social control. 

51. Individuals vary greatly in the vividness or realism of their memory 
images. 

52. Psychology and physiology meet in the study of personality. 

53. The human individual begins life as a single cell. 

54. Play of imagination is of service even to the scientific investigator. 

55. The skin is not uniformly sensitive. 

56. There is a physiological limit to every individual’s performances. 

57. All colors except absolute black exhibit brilliance. 

58. Memory is one of the divisions of psychology in which experiment has 
proven most fruitful. 

59. Spaced repetitions are more effective than unspaced ones. 

60. In trial-and-error behavior there is always some goal to be realized. 

61. A baby can be made to fear a rabbit by simultaneously administering a 
sound loud enough to produce a withdrawing response. 

62. The heredity of an individual enters into all his cells. 

63. Animals and children learn largely by the trial-and-error method. 
64. Binet was the first man who endeavored in a comprehensive way to
measure an individual’s general intelligence. 

65. The encountering of any obstruction in a response or task releases extra 
energy. 

66. Every response has its stimulus. 

67. A cook can and should use to the best advantage the [word illegible] of
contrast.

68. A judgment is a thought which combines two concepts. 

69. Forgetting proceeds very rapidly at first, but more slowly until there is
no appreciable change. 

70. Behavior is more adequate as the nervous system is better developed. 

71. Winking the eye when a flying stick comes near it is a reflex. 

72. Anticipations are images based on past experiences. 

73. Sound is physically a wave motion, or vibration in the air. 

74. Feeling is a stirred-up state of the organism. 

75. A distraction is anything that works against sustained attention. 

76. Psychological experiments have proven that cramming is a bad procedure.

77. Red is complementary to green. 

78. Inhibition of breathing often occurs at moments of eager attention. 

79. Feeling is inextricably bound up with doing. 

80. A mood differs from an emotion in being less in intensity and longer in
duration.

81. Red-green blindness is very uncommon among women. 

82. Overactive adrenals furnish an explanation for the irritability of some 
individuals. 

83. Instinct is unlearned behavior. 

84. When attention seems to be divided between two performances, it is
usually shifting back and forth between them. 

85. Perception is largely a process dependent upon earlier knowledge. 

86. Perception is a process of trial and error. 

87. Real activity occurs only where both stimulus and motive are present. 

88. Observation means the process of coming to know objects by use of the 
senses. 

89. Objects at the center of the field of vision are seen in the greatest detail. 

90. Breathing is shallower during a period of concentrated attention. 

91. When a rat runs the same maze repeatedly his time decreases from trial 
to trial. 

92. Feeling is internal rather than overt activity. 

93. The wishes that find outlet in dreams are, according to Freud, wishes that
the individual has submerged to the unconscious level. 

94. Day dreaming is a sort of play, more distinctly imaginative than most 
other play. 

95. Psychology’s goal is the ultimate comprehension of mind to the same 
extent that chemistry and physics aim at an understanding of material phenomena.
96. Feeling may be dominated and kept down by absorption in activity. 

97. Psychologists are sceptical of general conclusions drawn from the behavior 
of a single individual. 
98. Every mental process stands in some sort of relationship to another. 
99. Psychological group testing on a large scale was first used in the World
War.

100. Girls generally surpass boys in language activities. 

101. Most children’s IQ’s remain constant from year to year. 

102. Timbre means the characteristic sounds of different instruments and 
other sources of sound. 

103. The child gains from his play much practical acquaintance with the force 
of gravitation. 

104. Some of our expressions are survivals of acts that were of practical utility 
in the lives of prehistoric people. 

105. The quickness of passage of the impulse in the nervous system appears 
to be positively correlated with intelligence. 

106. The left brain in right-handed people dominates the right in motor 
activities. 

107. Much of our thinking is symbolic. 

108. Imagery is the name given to any revived experience which was originally 
a perception.

109. In an after-image the response outlasts the sensation. 

110. The Binet tests are individual ones. 

111. Much complex social behavior is based upon organic needs. 

112. Day dreams are motivated. 

113. The course of transmission from neurone to neurone is determined by the 
openness of the paths of connection between the sensory and the motor neurone. 

114. Propaganda is a special type of social control. 

115. The normal time for a day dream is the time when there is no real act to 
be performed.

116. What activity appears in a dream, according to Freud, is but the symbol 
of the underlying wishes and their fulfillment. 

117. The cerebellum is the organ controlling bodily activity and equilibrium. 

118. In memorizing connected passages of prose or poetry, the efficient pro- 
cedure consists in noting the general sense of the passage, the place of each part 
in the general scheme, the structure of the sentences and the author’s use of 
particular words. 

119. The unit of the nervous system is called the neurone. 

120. Nerve impulses always follow the path of least resistance. 

121. A stimulus is any form of energy acting upon a sense organ and arousing 
some activity of the organism. 

122. Appearance soon after birth is a more accurate criterion of native reaction 
than universality. 

123. Performance tests make use of concrete materials. 

124. Experiment makes it possible to control accurately the conditions and 
antecedents of mental operations. 

125. The field of psychology lies between the biological sciences on the one 
hand, and the social sciences on the other. 

126. Imbeciles are less intelligent than morons. 

127. Intelligence and learning power reach their maximum at ages fifteen to 
twenty. 

128. Images taken for real things are hallucinations. 
129. In times of danger or fear, there is an abrupt stop in the activity of the
stomach muscle. 

130. The idiot stands at the lowest level of mental development among humans. 

131. Sneezing is the earliest manifested reflex in babies. 

132. After-images occur principally in visual sensations. 

133. Long continued practice tends to produce a loss in interest and learning. 

134. The sense of touch has two elementary qualities, contact and pressure. 

135. No human reaction is too fast to be measured. 

136. Rapid adding of a column of figures is a process of controlled association. 

137. Sometimes a person worries for fear that for which he really wishes will 
happen. 

138. Human speech is a good illustration of the combination of unlearned 
movements with learned movements. 

139. Instinct is a product of natural selection. 

140. The most important part or division of the nervous system is the cere- 
brum. 

141. The psychologist adopts and maintains the impersonal attitude in all his 
research. 

142. A composite image is built up through frequent repetition of the same 
experiences. 

143. Satisfaction is less distinctive and definite than want. 

144. It is possible to learn to control some reflexes so as purposely to facilitate 
or inhibit them. 

145. The thyroid gland contains iodine. 

146. The central nervous system consists of the brain and spinal cord. 

147. Psychology is the latest outgrowth of philosophy. 

148. Recalling is a reaction to segments of behavior already fixed. 

149. Inactivity may be due to the lack of an immediate goal, even if the 
distant goal is known. 

150. The number of dots in a collection of eight to ten can be grouped at a 
single glance, provided they fall in groups. 

151. The cerebrum varies considerably in size from person to person. 

152. The lockstep system of promotion of school children has been proven to 
be a bad educational procedure. 

153. Stimulation of a well-marked cold spot on the skin will produce the 
sensation of coldness whatever the stimulus may be. 

154. A reflex becomes weaker and weaker and may cease altogether after 
repeated excitation. 

155. Social control means the modification of the direction of individual be- 
havior by the activities of other individuals or by the result of the activities of 
other individuals. 

156. The James-Lange theory opposes the popular notion that the feeling of 
fear precedes and arouses the motor behavior characteristic of fear. 

157. In the human production of voiced sounds the tonal element is furnished 
by the vocal cords of the larynx. 

158. The more immature a subject is, the less reliable is his report of a pro- 
cedure. 

159. The outer layer of the cerebrum is called the cortex. 
160. The reflex is an immediate unlearned response to a stimulation.

161. The rods when dark adapted respond to fainter light than the cones are 
capable of responding to. 

162. Empathy is imagining how one would feel if one were doing what the 
observed object is doing. 

163. Space relations and colors are seldom remembered correctly. 

164. The most commonly occurring eye troubles are errors of refraction. 

165. The stimulation and response are not equal in intensity. 

166. Boys surpass girls in tests of general information. 

167. Reflexes are interlocking mechanisms, some facilitating each other, and 
some inhibiting each other. 

168. The spinal cord is the center for reflex actions. 

169. Negative adaptation takes care of many stimuli that would otherwise 
interfere with sustained attention. 

170. The maladjustments of the adult are often due to persisting childish 
notions. 

171. A nerve impulse is a combination of electrical and chemical processes. 

172. Sensation is an unlearned response. 

173. Either hypnosis or psychoanalysis may be used in unearthing forgotten 
memories. 

174. Retention is a psychological process allied to habit. 

175. Pain spots require more intense stimulation than warm spots. 

176. Percepts never precisely reproduce the original physical stimulus. 

177. The striped muscles are the active muscles that are responsible for the 
change of position of the body members. 

178. Retention is unconscious. 

179. Conditioned reflexes are dependent upon central connections in the
cerebral cortex.

180. If an inverted picture on the retina is turned right side up by aid of a lens, 
the immediate effect is to make the subject reach upwards for any object seen on 
the floor. 

181. Pain sensations are the most numerous among the cutaneous sensations. 

182. When used to indicate differences in native ability, intelligence tests
presuppose that the individuals compared have had equal opportunities to learn what
the tests require. 

183. The primary development of any organ or muscle is called maturation. 

184. The images that occur in the hazy state between waking and sleeping 
are called hypnagogic images. 

185. The genetic method is concerned and based almost exclusively on the 
past experiences of the individual. 

186. Recall plays somewhat the same preliminary part in reasoning as atten- 
tion does in sense perception. 

187. The auditory portion of the ear is called the cochlea. 

188. Animals observe locations better than simple mechanical relations. 

189. Aphasia is loss or disturbance of speech. 

190. Attention is related (inversely) to the nervous process of fatigue. 

191. The motivation of any extensive activity is likely to be complex. 

192. Worry is a substitute for real activity, when no real activity is possible. 
193. One seldom if not rarely has a pure experience of the elementary type of
sensation. 

194. The organic condition of an individual influences an individual’s response 
as much as external stimuli do. 

195. The kinesthetic sense is keener than the sense of touch in comparing the 
weight of objects. 

196. The white matter of the brain consists of nerve fibers. 

197. There is no exact boundary between noises and tones. 

198. The striped muscles constitute about a third to a half of the total mass of 
the organism. 

199. Every reflex arc passes through the gray matter of the brain or cord. 

200. Animals learn very little by observation. 

201. The odor of food in the mouth reaches the olfactory organ by way of the 
throat. 

202. Hormones are substances produced in very minute quantities by the 
endocrine glands. 

203. The spinal cord is a long cylindrical structure with thick walls and a very 
small canal down its center. 

204. The reflex is the elementary unit of action. 

205. Contrast is a heightening of effect due to the presence either at the same 
time or in rapid succession of other qualities. 

206. Increasing the strength of a sensory stimulus increases the number of 
sensory nerve fibers thrown into action. 

207. Most images are inferior in realism and in completeness to the actual
sensory experience.

208. The brain stem is between the upper part of the cord and the cerebrum. 

209. Adaptation denotes a difference in the experience of the original quality 
during continuous stimulation. 

210. The cornea of the eye lacks warm spots altogether. 

211. All endocrine secretions are circulated by the blood stream. 

212. The dendrite in a synapse is a receiving organ. 

213. Behavior that shows indications of reasoning is characterized by the 
inhibitions of prompt overt responses to a situation. 

214. The reflex is the functional unit of the nervous system. 

215. A conditioned reflex can be extinguished by applying the conditioned 
stimulus time after time, without following it by the natural stimulus. 

216. The auditory area is located in the temporal lobe. 

217. Revery is the best sample of free association. 

218. The contraction of smooth muscles is less prompt and more independent 
than that of striped muscles. 

219. Of air vibrators having the same rate of vibrations those with the greater 
amplitude give the louder sensation. 

220. The bitter taste is gotten principally from the rear of the tongue. 

221. The psychologist steers clear of arbitrary evaluation. 

222. The reflex arc is the nervous machinery which carries out a reflex. 

223. A conditioned response in process of formation usually appears irregularly. 

224. The butterfly shaped portion in the center of the cord is called the gray 
matter. 

225. The autonomic nervous system deals with the more primitive and vegetative 
functions of the organism. 

226. The endocrine glands have no special outlet. 

227. There are thirty-one spinal nerves in all. 

228. The pain nerves are not provided with accessory apparatus around the 
special receptors. 

229. Stimulation and response form the irreducible unit or element of human 
or animal psychology. 

230. Unitarism assumes that mind and body are composed of the same sub- 
stance. 

231. The static receptor is a complicated structure in the inner ear. (.60) 

232. The visual area is near that part of the cortex marked by the calcarine 
fissure in the occipital lobe. (.57) 

233. The number of salient odors is less than the number of distinguishable 
ones. (.56) 

234. The surface of the cortex is covered with rounded creases. (.55) 

235. Ebbinghaus discovered that the ratio of what is retained to what is 
forgotten varies inversely as the logarithm of the time. (.54) 

236. Ageusia is a loss of the sense of taste. (.51) 

237. Physiology is like psychology in that many of its nouns as respiration, 
digestion, and circulation of the blood are primarily verbs. (.33) 

238. The pons is a broad band of nerve fibers lying in front of the medulla and 
crossing it horizontally. (.30) 

239. All association is really free association, what we call controlled associ- 
ation, being free association followed at once by a selection from the recalled 
material of that which suits the present purpose. (.29) 

240. Aboulia is an abnormal degree of lack of zest for action. (.25) 


IMPROVING AND EVALUATING THE EFFICIENCY OF 
COLLEGE INSTRUCTION¹ 


C. H. SMELTZER 


Temple University 


PURPOSE AND SCOPE OF THE STUDY 


The improvement of college instruction may involve such large 
fields as the employing of a better trained staff, more and better 
equipment, modification of the curriculum, increased accuracy in the 
measurement of the educational product, grouping, class size, aims 
and objectives, adjustments to individual differences, or an improve- 
ment in teaching method in order that learning on the part of the 
student is commensurate with economy of time and effort. Whenever 
any attempt is made to bring about improvement in college instruction 
the ultimate outcome should be manifested in the learning of the 
students. The aim of college instruction is to cause learning in those 
who come into contact with this instruction. It is often, and in one 
sense, rightly said that the major function of a faculty member is 
teaching. But in a truer sense his function is not to teach but rather 
to create maximal opportunities for the student to learn. This last 
function involves not merely procedures in classroom presentation, 
but also such control of motivation both in class and in study, such 
adjustments of work to individual differences in ability, such arrange- 
ments of routine permitting individualized help for each student 
according to his needs, and such frequent informing of the student 
regarding his progress, as will make each member of the class maxi- 
mally progressive in his educational development. 

Learning may mean any one or more of a combination of things 
such as the accumulation of factual information; the widening of 
experience by coming into contact with the world’s knowledge; the 
formation of judgements, attitudes, points of view; or in other words, 
the creating of desirable changes in student behavior and accom- 
plishments. All of these gradually are falling prey to accurate 
measurement. 

There is today perhaps no field in the improvement of college 
instruction causing more widespread interest than the activities in 
the classroom. Neither is there a field in which there is more “pure 
personal opinion” as to what is good or bad and what should or should 
not be done, due primarily to a great lack of sincere, unbiased experimental 
evidence. However, opinion and speculation may be the 
pioneers of science when they serve as a working hypothesis for 
experimentation. 

1 The author is deeply indebted to Dr. Sidney L. Pressey of Ohio State University, 
under whose general direction this work was carried on. 

The research reported here was an attempt to develop methods 
and techniques that would make the teaching of educational psy- 
chology more effective. This research was preceded by more than 
three years of preliminary work on the effectiveness of different 
methods of instruction and various analyses of teaching results, most 
of which have been published.¹ 

The problem was not one of comparing different methods of teach- 
ing, but instead, the development and evaluation of a method which 
would be more effective than the usual discussion—recitation—lecture 
type so common in the college classroom. This involved such problems 
as better adjustment to individual differences in terms of capacity, 
diagnosis of individual difficulties and incentives for better work, and 
changes in the activities of the instructors that not only would 
helpfully direct the learning processes but also could be evaluated. 

Summarizing then, the problem was to develop and evaluate a 
more effective technique for teaching Educational Psychology through: 
(1) A better adjustment to individual differences in capacity to learn, 
(2) a diagnosis of student educational problems and difficulties, (3) 
greater incentives for meeting minimal requirements—all of which 
will tend to raise standards. 


HISTORICAL STATEMENT 


A brief epitome of the work of the important researches, carried 
on in the field of college teaching, that bear a relationship to the 
problem is well set forth in the following quotation on page 90 from 
the 1929 Yearbook of the National Society of College Teachers of 
Education: 


“Carefully controlled experiments to determine the relative merits of different 
teaching methods at the college level have not been widely undertaken. Published 
reports dealing with this fundamental phase of the improvement of college teaching 
are decidedly scarce, and unpublished reports have come to the attention of the 
Yearbook Committee from only one or two sources. It is probable that many 
worth-while experiments in this field have not been brought to the attention of the 





1 Pressey, Sidney L., and Luella C. Pressey, and Others: Research Adventures 
in University Teaching, Bloomington, Illinois, Public School Publishing Company, 
1927. 

educators in general because of failure on the part of the investigators to write up 
and publish their experiments.”¹ 


A bibliography of twelve thousand references on higher education 
compiled by Newland and Toops, now being printed under the 
auspices of the Bureau of Education, Department of the Interior, 
in Washington, D. C., contains over five hundred articles on various 
aspects of college teaching. Of these five hundred articles not more 
than approximately twenty-five represent well controlled experiments 
in the field. The others are discussions and criticisms of the numerous 
phases of college teaching. What a willingness to express opinion in 
the absence of proven facts! 

Haggerty characterizes the situation thus: 

“However important may be the factual results of these investigations, they 
do not constitute the major significance of our endeavors in collegiate educational 
research. This lies in the altered and altering attitude of the university community 
toward the understanding and solution of college problems. Sad as it 
may seem, we must reckon among the serious obstacles to the hostile, attitude on 
the part of many college teachers. They simply accept their present teaching 
competence as adequate and are irritated if it is suggested that their ways may be 
changed for the better.”² 


The merits of subjective procedures and criticisms are limited. 
There is a strong tendency to base methodology on practical infor- 
mation obtained through experimentation at the elementary, second- 
ary and college level. Up to the present most of the work has been 
done at the first two levels. The results of this body of excellent work, 
however, though applicable to elementary school instruction and high 
school instruction, throw little or no light on the solution of college 
teaching problems. The spirit of investigation should prevail with 
reference to the many problems at each step in the educational ladder 
and transfer not implied until it has been demonstrated. 

A few of the best investigations of college teaching have been 
conducted by the following: Bane³ and Spence⁴ on the comparative 





1 Gray, W. S. and Others: “Current Educational Adjustments in Higher Education.” 
Yearbook Number 17 of the National Society of College Teachers of Education, 
Chicago, Illinois, University of Chicago Press, 1929. 

2 Haggerty, M. E.: The Improvement of College Instruction through Educational 
Research. Bulletin of the American Association of University Professors. 
May 1931, pp. 17, 388. 

3 Bane, Charles Lafayette: The Lecture Versus the Class Discussion Method 
of College Teaching. School and Society, March 7, 1925. Vol. 21, pp. 300-302. 

4 Spence, Ralph B.: Lecture and Class Discussion in Teaching Educational 
Psychology. Journal of Educational Psychology, Vol. 19, October 1928, pp. 454-462. 

value of the lecture method and the class-discussion method of teaching 
educational psychology; Scheidemann,¹ on the lecture-conference 
method and individualized instruction; Gwinn,² on the question-and-answer 
method versus the lecture method. The findings of these 
tend to indicate no great difference between the various methods used. 
The amount of overlapping in the preparation of students is reported 
by Reeves.³ Worcester⁴ investigated the knowledge of educational 
psychology students possessed upon entering the course. 

The literature contains nothing that is really similar to the adjust- 
ment of individual differences carried out in this study. 

Summarizing the merits of different methods, the 1929 National 
Society Yearbook states: 


“The limited extent to which controlled experimentation in the field of college 
teaching has been carried on leads to the suggestion that work of this kind should 
be encouraged. The repetition of some of the experiments which have already 
been tried with limited groups, together with the setting up of experimental 
conditions for the comparison of types of teaching which have not been experi- 
mentated with, would seem worthy projects for those interested in the improve- 
ment of college instruction. The studies reported in this section are particularly 
rich in suggestion of techniques for the evaluation of different teaching procedures. 
The paucity of reliable standardized tests in college subject-matter fields necessi- 
tates the use of other measures of results, more or less objective, such as the 
testimony of students or instructors, subsequent elections in the same field by 
the students experimentated upon, and objective tests administered before 
and after experimentation.”⁵ 


NATURE OF THE PROGRAM 


The subject which was used in this experiment was Educational 
Psychology, a five credit hour professional course offered each quarter 
at the freshman-sophomore level, and having a prerequisite of one 
previous course (Elementary Psychology). The main groups of data 
were obtained from the classes held during the Autumn, Winter and 
Spring quarters during the academic year 1930-1931 at Ohio State 
University. Additional data were obtained from three other colleges 
involving six instructors and two hundred twenty-nine students. At 
Ohio State during each of the three quarters there were three “experimental” 
sections, in which the special procedures were used, while 
three “control” sections (in the Spring four control sections) were 
taught by a conventional procedure. A total of seven different 
instructors taught nineteen classes embodying five hundred twenty-three 
students during the three quarters; each at some time had both 
an experimental and a control section, and as nearly as administratively 
possible, the number of sections of each type which each instructor 
had was kept equal. All sections at the beginning of the course were 
given a three hundred item pre test. This test covered the course to 
determine the initial knowledge of each student in the field. A 
similar three hundred item test was given at the end of the course. 
The pre and end tests were revised from quarter to quarter; each 
(with the exception of the Autumn pre test which contained two 
hundred multiple choice only) contained two hundred multiple choice 
questions and twenty application judgment problems each with five 
solutions to be evaluated. The judgment problems were developed 
for the purpose of measuring student judgment of practical school 
situations. A system of weighting was used in evaluating the answers, 
all too elaborate to describe here. A number of checks for validity 
were made on all pre and end tests. The reliability coefficients ranged 
from .90 ± .008 to .94 ± .007. The experimental and control sections 
were then compared as to the gains made on the end test over the 
pre test, as will shortly be shown in more detail. Intelligence test 
percentiles also were obtained for each student, and cases in the experimental 
and control groups paired according to intelligence, sex and 
pre test score. 

1 Scheidemann, N. V.: A Comparison of Two Methods of College Instruction. 
School and Society, Vol. 25, June 4, 1927, pp. 672-674. 

2 Gwinn, Clyde Wallace: An Experimental Study of College Classroom Teaching: 
The Question-and-Answer Method of Teaching College English. Nashville, 
Tennessee, George Peabody College for Teachers, 1930. The Phi Delta Kappan, 
Vol. 13, No. 5, February 1931, p. 146. 

3 Reeves, Floyd W., and Others: “The Overlapping of Classes in Indiana 
University.” Report of a Survey of the State Institution of Higher Learning in 
Indiana. Board of Public Printing, State House, Indianapolis, Indiana, 1926. 

4 Pressey, Sidney L., and Luella C. Pressey, and Others: Research Adventures 
in University Teaching, Bloomington, Illinois, Public School Publishing Company, 
1927. 

5 Gray, W. S. and Others: Current Educational Adjustments in Higher Education, 
Yearbook Number 17 of the National Society of College Teachers of Education, 
Chicago, Illinois, University of Chicago Press, 1929. 


SPECIAL TEACHING PROCEDURES 


The special procedures in the experimental and control sections 
remain briefly to be described. In the experimental sections, as 
finally developed, the work of the quarter was divided into weekly 
units. The work of each week was reviewed using an informal-discussion 
method, on Tuesday, Wednesday, and the first thirty 
minutes of the Thursday hour. During the last twenty minutes 
Thursday an objective test was given covering the work of that week. 
This was graded Thursday evening. The right answers to those ques- 
tions on which the students had made mistakes were written in with 
red pencil. Thereby, when the tests were returned, students could 
see at once not only where they had been wrong but also the right 
answer properly expressed. An analysis of the frequency of errors 
also was made for each class. The papers were returned at the 
beginning of the hour Friday and thirty minutes or more spent in 
discussion of the results. Those students obtaining “A” or “B” 
(constituting about thirty per cent of each class) were then informed 
that they would be excused for the Monday hour, and the remaining 
students were told that they must return Monday for an intensive 
review and a further test. The remainder of the Friday hour was 
spent on the assignment and an explanation of the work for the coming 
week. 

Thus Monday, only the students who had done merely average 
or below average work on the previous week’s material came to class. 
These students were given intensive review with special reference to 
the weak points exhibited on the Friday test. Further, since the 
experimental sections were handled by the “interview” method, the 
instructor had much information about each student and in large 
measure could individualize the work for each of these students during 
the hour. Finally, during the last twenty minutes of the hour, a 
second test (another form of the Thursday’s test, of equal length and 
difficulty) covering the previous week’s work, was given. The final 
mark which these students received for the week was the average of 
the Thursday and Monday tests. 
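
The weekly marking rule just described can be put in a few lines of Python. This is only a minimal sketch and not the scoring procedure actually used in the course: the function name `weekly_mark` and the numeric scores are hypothetical, and it is assumed here (the text does not say so explicitly) that a student excused from the Monday hour simply keeps the Thursday result.

```python
from typing import Optional


def weekly_mark(thursday: float, monday: Optional[float] = None) -> float:
    """Return one student's mark for the week.

    Assumption: a student excused from the Monday review has no Monday
    score (None) and keeps the Thursday score unchanged.
    """
    if monday is None:
        return thursday
    # Students who returned Monday receive the average of the two tests,
    # so a poorer Monday test lowers the mark just as a better one raises it.
    return (thursday + monday) / 2


# Hypothetical scores, for illustration only.
print(weekly_mark(42.0))        # excused student keeps the Thursday score
print(weekly_mark(30.0, 38.0))  # review helped: mark rises to the average
print(weekly_mark(30.0, 26.0))  # Monday test was poorer: mark falls
```

Averaging the two tests, rather than taking the better one, is what gives the Monday test the double edge noted in the text: it offers a chance to raise the mark but also the risk of lowering it.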

The above routine seems relatively simple but it had been worked 
out little by little during the previous year, as advantages of this or 
that feature of the procedure became apparent. The following 
features should be noted: (a) Every student took at least one carefully 
worked out objective test each week, thus keeping each student 
informed regarding his status and progress, and the location of his weak 
points. (b) There was definite motivation in connection with the 
work, in that each week there was a definite appraisal, and further 
there was reward of a holiday for good work as contrasted with review 
and extra work for those doing not as well. The poorer students also 
had definite motivation in that by review study over the week-end, 
they could raise their grades by means of the second test on Monday, 
with the potential additional penalty of lowering the grade in case the 
Monday’s test was poorer. (c) There was adjustment to individual 
differences and needs in that the good students were given less and the 
poorer students more time and attention, and in connection with the 
frequent tests and interviews and the small groups Mondays there was 
an unusual opportunity for the instructor to know those students who 
most needed help, so that individual attention could be given them. 
Finally, (d) there was a definite, continued, direct, and experimentally 
guided attempt to raise standards of work in the course. 

It may also be mentioned (a matter somewhat aside but still of practical 
importance in connection with the feasibility of such a program) that by use of 
certain special cards and devices for testing and scoring, developed in connection 
with the course, cost of mimeographed materials was greatly reduced and scoring 
greatly facilitated. A conservative estimate of the number of questions scored 
during the preliminary work and the main experiment was 1,556,440. 


Such was the program for the experimental sections. The program 
for the control sections called for informal lecture and discussion. 
The general method of instruction in this respect was largely similar 
to the classroom method used in the experimental sections, except that 
work was not as definitely organized in units. Also, in the control 
sections all students, including those who did good work, came five 
days a week, the time given to the weekly tests was available for dis- 
cussion, and no interviews were held. 

Common to both experimental and control sections was this feature, 
that there was considerable intervisitation during class hours on the 
part of instructors. Through such visiting, instructors were able to 
pool their own ideas and discuss their own problems among themselves. 
It appears reasonable to assume that in both the control and the 
experimental sections the actual classroom teaching was of reasonably 
good quality. The experimental sections varied from the control 
sections primarily in the organization of work with reference to 
frequent appraisal, motivation, and adjustments to individual differences, 
as mentioned heretofore. 


The above statement presents only a brief outline of the total experimental 
program of the year. As a matter of fact, the routine was somewhat different 
each quarter. In the fall, the “instructional” tests were given only each fortnight; 
but this was not found to be often enough, and the weekly testing described was 
used in Winter and Spring. In the Spring quarter there was this innovation, that 
the Monday review and check test was optional, thus permitting indifferent, poor 
students to cut the review and allowing interested students to benefit by this 
special review if they so desired. The interesting outcome of this latter innovation 
will be described shortly. 

1 Smeltzer, C. H.: Economizing in Test Administration. Pennsylvania School 
Journal, Vol. 80, April 1932, pp. 565-566. 


RESULTS 


A brief summary of the results of such an undertaking is not easy. 
To make clear the general handling of the results a sample table of the 
Winter quarter distributions will be given in this report. Several 
other tables showing the gains and difference in gain between the 
experimental and control sections for each of the three quarters also 
will be given. Further analyses were made according to pre and end 
tests, weekly tests, intelligence, class, college, a repeated core of ques- 
tions on all pre and end tests, the data as to the number of students 
excused and shifts in attendance at the Monday reviews. No report 
will be made here of all these various analyses.¹ The consistency of the 
results from quarter to quarter is important evidence of their validity. 
In all, the investigation has been considered one of the most elaborate 
thus far made of college teaching problems. The results can be 
demonstrated most clearly if one compares the records which the 
experimental and control groups made on the end test with their 
scores on the pre test. 


INTELLIGENCE 


Table I gives a summary of the intelligence test percentiles at the 
90th, 75th, 50th, 25th, and 10th percentile points on the distributions 
for the experimental and control sections for each of the three quarters. 
The average for each of these groups is found in the last two columns 
of the table. There is, with one or two exceptions, a marked similarity 
between the corresponding values at most points in the table. The 
small advantage that does exist favors the control sections in that they 
are, on the whole, about two percentiles higher than the experimental 
sections. Both groups were somewhat above the average for the 
total student body in the University. 


TABLE I. SUMMARY STATISTICAL CHARACTERISTICS OF THE INTELLIGENCE 
PERCENTILES FOR THE AUTUMN, WINTER, AND SPRING QUARTERS 

                      Experimental                Control                 Average
                 Autumn  Winter  Spring    Autumn  Winter  Spring   Experimental  Control
90 percentile     88.4    97.2    93.8      93.5    96.6    95          93.1        95
Q3                82.9    86.8    83.9      77.9    88.1    81.4        84.3        82.7
Median            57.4    56.9    66        55.2    69.8    66.4        60.1        63.8
Q1                82.9    37.7    41.6      36.2    41      44.1        39          40.4
                  [sic]
10 percentile     28.6    27.6    31.9      26.9    32      31.5        29.4        30.1
Students          86      76      82        85      88      106         244         279


1 For a detailed analysis of this material the reader is referred to: Smeltzer, 
C. H.: An Experimental Investigation of Certain Teaching Procedures in Educational 
Psychology. A Dissertation. Ohio State University, 1931. 
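
The comparison of the two Average columns of Table I can be tabulated directly. The short Python sketch below is offered only as an aid to reading the table, not as part of the original analysis; the dictionary and function names are hypothetical, and the figures are the Average values as printed in Table I.

```python
# Average columns of Table I as printed: (experimental, control).
table_i_average = {
    "90 percentile": (93.1, 95.0),
    "Q3": (84.3, 82.7),
    "Median": (60.1, 63.8),
    "Q1": (39.0, 40.4),
    "10 percentile": (29.4, 30.1),
}


def control_minus_experimental(rows):
    """Difference (control minus experimental) at each percentile point."""
    return {
        label: round(control - experimental, 1)
        for label, (experimental, control) in rows.items()
    }


differences = control_minus_experimental(table_i_average)
for label, diff in differences.items():
    print(f"{label:>13}: {diff:+.1f}")
```

On these printed averages the control sections are higher at four of the five points, by amounts ranging from 0.7 to 3.7 percentiles, while at Q3 the experimental sections are higher by 1.6.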


AUTUMN QUARTER RESULTS 


The various teaching procedures used in the experimental and 
control sections have been previously described. In Table II and all 
others of similar character the scores for all experimental classes were 
combined into one distribution of scores and a similar procedure used 
for all the control classes. Table II gives a summary of certain 
statistical characteristics for the distributions of the pre and end test 
results for the Autumn quarter. 


TABLE II. VARIOUS PERCENTILE VALUES OF THE DISTRIBUTIONS FOR ALL THE 
EXPERIMENTAL AND CONTROL SECTIONS ON THE PRE AND END TESTS. 
AUTUMN QUARTER 

                       Pre test                  End test            Critical
                Experimental  Control     Experimental  Control       ratio
90 percentile      101         102.5         175.3       168.5
Q3                  89.5        87.2         164.6       154
Median              79.3        71.9         149.7       141.9          1
Q1                  71.4        60.2         142.5       122.3          2
10 percentile       61.6        51.5         131.6       106.8          2.4


It may appear at first glance that 
the scores on the pre test are very high. The fact that students are 
required to take a course in educational psychology for certification as a 
public school teacher is not a guarantee that they will enter into such 
a course with no knowledge of some of the subject-matter beforehand. 
Most students entering into such a field of subject-matter do come 
with some pre knowledge, since they have been in contact, more or less 
directly perhaps, with school situations for a number of years. 

The pre test, as heretofore described, consisted of two hundred 
multiple choice questions covering the entire course. Although 
the students were asked to omit those questions they could not 
answer, or to guess only when they were reasonably certain of the 
correct answer, it does seem reasonable to believe that most students 
would answer as many questions as they could 
and would omit as few as possible. It was found, on the whole, that 
not more than twelve per cent of all questions were omitted on the pre 
test. Since most of the questions possessed four possible answers, 
chance would permit in the neighborhood of fifty being answered 
correctly. The median score for the experimental sections was 79.3 
and for the control sections 71.9, indicating that the central tendency 
was to score 20-30 points above what pure guessing at the answers 
would permit. There is little doubt that this, on the whole, 
indicated that pre knowledge of educational psychology was by no 
means an exorbitant amount. Ten per cent of the experimental 
students scored above 101 and ten per cent of the controls above 102.5, 
which indicated that a number of students possessed more knowledge 
of the course in the beginning than others. Whether or not this 
amount is sufficient to warrant their not being required to pursue the 
course will not be discussed here. Ten per cent of the experimental 
students fell below 61.6 points and the same proportion of controls below 
51.5. Those in this range may be thought of as having a very 
limited fund of pre knowledge. 
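
The chance level referred to above follows from simple arithmetic, which the Python sketch below makes explicit. It is an illustration under simplifying assumptions that the text itself qualifies: every one of the two hundred items is taken to have exactly four choices and to be answered by blind guessing, whereas the text says only that "most" items had four answers and that up to twelve per cent of items were omitted. The function and variable names are hypothetical.

```python
def expected_chance_score(n_items: int, n_choices: int) -> float:
    """Expected number right when every item is answered by blind guessing."""
    return n_items / n_choices


chance = expected_chance_score(200, 4)

# Autumn pre test medians as reported in the text.
medians = {"experimental": 79.3, "control": 71.9}
margins = {group: round(median - chance, 1) for group, median in medians.items()}

print(chance)
print(margins)
```

Under these assumptions the chance score is 50, and the two medians stand 29.3 and 21.9 points above it, which is the "20-30 points" of the text.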

Three things stand out in bold relief as one surveys the general 
appearance of the end test results: first, the marked scattering of 
cases in the control sections below the lowest score made by any 
member of the experimental sections; second, all percentile points 
are considerably higher for the experimental sections than for the 
control, with a tendency for the difference to increase, going from the 
top to the lower ranges of the distributions; third, there is a greater 
dispersion of scores in the controls than in the experimentals. 
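
The second of these observations can be checked against the end test columns of Table II. The Python sketch below is only an aid to reading the table and is not part of the original analysis; the names are hypothetical, the figures are the Autumn end test values as printed, and the row labels Q3 and Q1 are taken from the rows' positions in the table.

```python
# Autumn end test values from Table II: (experimental, control).
end_test = {
    "90 percentile": (175.3, 168.5),
    "Q3": (164.6, 154.0),
    "Median": (149.7, 141.9),
    "Q1": (142.5, 122.3),
    "10 percentile": (131.6, 106.8),
}


def end_test_gaps(rows):
    """Experimental minus control at each percentile point."""
    return {
        label: round(experimental - control, 1)
        for label, (experimental, control) in rows.items()
    }


gaps = end_test_gaps(end_test)
for label, gap in gaps.items():
    print(f"{label:>13}: {gap:+.1f}")
```

The gaps run 6.8, 10.6, 7.8, 20.2, and 24.8 points from the 90th percentile down to the 10th: not a strictly regular increase, but a clear widening toward the lower ranges.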

It appears that the experimental procedure favored the lower 
ranges more than the upper; however, the critical point in a dis- 
tribution of scores is the passing point, so the higher the lower per- 
centile values can be raised the higher the standards of the course will 
be raised. The 90th percentile for the experimental sections is 6.8 
points higher than for the control sections, whereas on the pre test 
they were a trifle lower. 

However, the statistical significance of the difference at the 90th 
percentile points on the end test does not indicate absolute reliability. 
To rely on a purely statistical reliability interpretation at this point 
in the distributions would be somewhat spurious, for two reasons: 
first, many students in the upper third of the experimental sections 
had less instruction than the corresponding group for the controls, 
due to their being excused certain days after the fortnightly tests; and 
second, since two hundred was the highest possible score on this part 
of the test, both the experimental and control sections tended to 
approach the maximum score, which automatically reduces any possible 
difference in the upper percentile ranges. A larger number of 
cases in the experimental sections tended to reach this ultimate maximum 
score than in the control sections. Nevertheless a number of 
standard errors of differences between percentiles will be shown in 
order to locate those which are most reliable. 

The critical ratio at the 10th percentile is 2.4 which makes the 
chances of the difference between the 10th percentiles for the experi- 
mental and control sections being statistically reliable, about nine 
hundred ninety-two out of one thousand. Although this reliability 
does not fully meet the criterion of certainty, for the difference being 
a reliable one, it is approached very closely. At the 25th percentile 
the chances are nine hundred seventy-eight out of one thousand and 
at the median eight hundred forty-two out of one thousand, of the 
difference being a reliable one. 
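
The conversion from a critical ratio to "chances in one thousand" can be approximated from the normal curve. The Python sketch below is an assumption about how such figures are obtained, not a statement of the tables actually consulted for this study: it treats the critical ratio as a normal deviate and takes the area of the normal curve below it. The function name is hypothetical.

```python
import math


def chances_in_thousand(critical_ratio: float) -> int:
    """Chances in 1,000 that the true difference is greater than zero,
    assuming the difference is normally distributed about its observed value."""
    probability = 0.5 * (1.0 + math.erf(critical_ratio / math.sqrt(2.0)))
    return round(probability * 1000)


for critical_ratio in (2.4, 2.0, 1.0):
    print(critical_ratio, chances_in_thousand(critical_ratio))
```

This gives 992, 977, and 841 chances in one thousand for critical ratios of 2.4, 2, and 1. The first agrees with the figure in the text; the other two fall one short of the 978 and 842 quoted, a discrepancy that may reflect nothing more than the rounding of the table used.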

In summing up, then, the results of the teaching methods as used 
in the Autumn quarter in the experimental and control sections, as 
measured by the two hundred multiple choice questions covering the 
course, we find that the experimental sections exceed the controls 
at all the percentile points on the end test, with slight statistical 
reliability at the median, increasing to near certainty at the 
10th percentile. The teaching method lifts materially the students 
of the experimental sections in the lower ranges of the distribution, 
thereby raising standards for the course, and saving time for those in 
the upper ranges. 


WINTER QUARTER RESULTS 


The distributions of the pre and end test results for all the experimental 
and control sections for the Winter quarter are given in Table 
III. 

In order to facilitate the reading of the table the 90th percentile, 
the median, and the 10th percentile points are indicated by arrows, 
which are to the right of the distributions for the control sections. 
The line in each case is continued on to the left, until it reaches the 
distributions for the experimental sections. Each line shows where the 
percentile value of the control strikes the experimental. Another 
arrow running up or down gives the corresponding location of the 
percentile value for the experimental sections. 


TABLE III. DISTRIBUTIONS OF THE PRE AND END TEST SCORES FOR THE 
EXPERIMENTAL AND CONTROL SECTIONS ON THE ENTIRE TEST. WINTER 
QUARTER 

                     Pre test                                  End test
 Score      Experimental  Control                     Experimental  Control
                 f           f                             f           f
280-289                                                    2
270-279                                                    2           2
260-269                                                    5           5
250-259                                                    6           6    (90 percentile)
240-249                                                    8          10
230-239                                                   16          13
220-229                                                   13          10    (Md.)
210-219                      1                            11           7
200-209                                                    7           9
190-199          1                                         5           8
180-189          1                                                     6
170-179          2           2                             1           4    (10 percentile)
160-169          3           2                                         3
150-159          3          10    (90 percentile)                      1
140-149          2           4                                         2
130-139          6          10                                         1
120-129         17          15    (Md.)                                1
110-119         11           9
100-109          9           8
 90- 99          3           8
 80- 89          7           9
 70- 79          6           4    (10 percentile)
 60- 69          3           4
 50- 59
 40- 49          1
 30- 39          1           2
Total           76          88                            76          88

                       Pre test                  End test            Critical
                Experimental  Control     Experimental  Control       ratio
90 percentile      158         156.2         262.8       257
Q3                 129.4       137           245         241
Median             117.3       120           230.6       222
Q1                  93.3        93.8         215.5       195            1.3
10 percentile       74.3        77           202.3       172            2
Mean               114.8       116.8         231.1       216.8          3.2
S.D.                20.9        21.3          15.1        22.2

The pre test scores for this term are materially higher than for the 
Autumn quarter due to the addition of the twenty judgment problems 
(one hundred questions) previously mentioned under Nature of the 
Program. For this reason one should not make any comparisons 
between the statistical characteristics of the distributions for the 
Autumn and Winter quarters. 

The pre test distributions for the experimental and control sections 
are quite similar with respect to central tendency and variability. 
The experimental groups are a few points lower than the controls at 
all the indicated percentile points with the exception of the 90th 
where they are higher. These values are given by the statistical 
characteristics of the distributions at the bottom of Table III. On 
the end test the experimentals are higher at all percentile points. 
The greatest difference exists at the 10th percentile where the critical 
ratio is two; making the chances that it would be greater than zero, 
upon further experimentation under similar conditions, about nine 
hundred seventy-eight out of one thousand. Although the statistical 
reliability of the differences at the other percentile points is not as 
great as at the 10th percentile, it is, nevertheless, important to notice 
that the experimental sections scored a number of points higher than 
the controls at all percentile points. The students in the upper 
fourth of the experimental sections scored about six points higher than 
the corresponding group in the control sections, even though they had 
approximately a fourth less instruction than did the students in the 
control sections. 
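
The percentile values at the bottom of Table III can be recovered from the frequency columns by linear interpolation within the class that contains the percentile. The following is a minimal sketch, assuming (as the tabled values suggest) that each class is treated as running from its stated lower limit, for example 170 up to 180. The frequencies are the Winter end-test counts for the control sections as read from Table III, and the name `grouped_percentile` is illustrative only.

```python
def grouped_percentile(classes, n_total, p, width=10):
    """Interpolate the score below which a fraction p of n_total cases fall.

    classes: (lower_limit, frequency) pairs in ascending order. Only the
    classes up to the one holding the percentile need to be listed.
    """
    target = p * n_total   # number of cases that must lie below the percentile
    below = 0              # cumulative frequency below the current class
    for lower, freq in classes:
        if freq > 0 and below + freq >= target:
            # Assume cases are spread evenly across the class interval.
            return lower + width * (target - below) / freq
        below += freq
    raise ValueError("percentile lies above the classes supplied")


# Winter end test, control sections (N = 88), lowest classes first.
control_end = [(120, 1), (130, 1), (140, 2), (150, 1), (160, 3), (170, 4),
               (180, 6), (190, 8), (200, 9), (210, 7), (220, 10)]

p10 = grouped_percentile(control_end, 88, 0.10)
q1 = grouped_percentile(control_end, 88, 0.25)
median = grouped_percentile(control_end, 88, 0.50)
print(round(p10, 1), round(q1, 1), round(median, 1))
```

Under these assumptions the three values come out at 172, 195 and 222, the figures given for the control end test in Table III.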


SPRING QUARTER RESULTS 


Table IV gives the statistical characteristics for the distributions 
of the pre and end scores for the Spring quarter. It is similar in construction
and meaning to Table II which gave the Autumn quarter 
results. 

Again there is a marked similarity in the characteristics of the 
distributions for the pre test. The end test results also reveal the 
same characteristic differences as did those of the Winter quarter. 


TABLE IV.—VARIOUS PERCENTILE VALUES OF THE DISTRIBUTIONS FOR ALL THE 
EXPERIMENTAL AND CONTROL SECTIONS ON THE PRE AND END TESTS. 
SPRING QUARTER 

                          Pre test              End test
                     Experi-   Control     Experi-   Control    Critical
                     mental                mental               ratio
 90 percentile        169.6     173.5       256.8     239.6
 Q3                   154.6     155.6       244.5     230.3
 Median               126.2     120         228.6     205.3
 Q1                   101.5      89.4       209.5     183.9       1.1
 10 percentile         85.3      70.3       195.5     161.2       2.1


SUMMARIZATION OF RESULTS 


As a method of summarizing the results for the entire year the 
extent to which the gain of the experimental groups exceeds the gain 
of the control groups will be used. Table V gives the points gain 
(experimentals over controls) at the 10th, 25th, 50th (median), 75th, 
and 90th percentiles for each quarter. 


TABLE V.—GAIN IN POINTS FROM PRE TEST TO END TEST FOR (a) THE EXPERIMENTAL
AND (b) THE CONTROL GROUPS, TOGETHER WITH THE DIFFERENCE 
IN GAIN BETWEEN THE TWO GROUPS, AND THE PERCENTAGE WHICH THIS 
DIFFERENCE IS OF THE GAIN FOR THE CONTROL GROUPS 

 Percentile                      10th     25th     50th     75th     90th

 Autumn quarter (171 students)
   Experimental                  70       71.1     70.4     65.1     74.3
   Control                       55.3     62.1     70       66.8     66
   Difference                    14.7      9         .4     -1.7      9.3
   Percentage of difference      27       15        1       -2.5     12

 Winter quarter (164 students)
   Experimental                 128      122.2    113.3    115.6    104.8
   Control                       95       96.7    102      104      100.8
   Difference                    33       25.5     11.3     11.6      4
   Percentage of difference      34.7     26.4     11.1     11.2      4

 Spring quarter (188 students)
   Experimental                 110.2    108      102.4     89.9     87.2
   Control                       90.9     94.5     85.3     74.7     66.1
   Difference                    19.3     13.5     17.1     15.2     21.1
   Percentage of difference      21       14       20       20       32.4

An example will make the reading of the table clear. Reference 
to Table II will show that the experimental groups of the Autumn 
quarter at the 10th percentile scored 61.6 points on the pre test and 
131.6 points on the end test, making a gain of 70 points. The control 
sections made a gain of 55.3 points between the pre and end tests. 
Thus, the experimental groups at the 10th percentile made 14.7 
points more gain, or twenty-seven per cent more gain than did the 
controls. 
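
The arithmetic of this example can be written out as a short sketch. The function name `gain_comparison` is illustrative; the figures are those quoted in the example above.

```python
def gain_comparison(exp_pre, exp_end, control_gain):
    """Return the experimental gain, its excess over the control gain,
    and that excess as a percentage of the control gain."""
    exp_gain = exp_end - exp_pre
    difference = exp_gain - control_gain
    percentage = 100 * difference / control_gain
    return exp_gain, difference, percentage


# Autumn quarter, 10th percentile: pre 61.6, end 131.6, control gain 55.3.
exp_gain, difference, percentage = gain_comparison(61.6, 131.6, 55.3)
print(round(exp_gain, 1), round(difference, 1), round(percentage))
```

The same calculation applied to any other column of Table V gives the corresponding "Difference" and "Percentage of difference" entries.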

The percentages gained in the Autumn quarter, with tests occurring 
only fortnightly, are less than when they occurred weekly. The 
difference in gross gain in points during the quarter was due primarily 
to lengthening of the test, and may be disregarded in this connection. 

In the Spring quarter when the students were allowed to come 
back as they wished, the gain at the lower end was less but the average 
and upper end gain was greater. Nevertheless, taking all three 
quarters together, the 10th percentile gains for the experimental 
groups averaged over a quarter more than for the controls. 

The differences that constantly exist between the experimental 
and control sections are apparently due to the differences in the general 
teaching and classroom procedures followed with the different groups. 
No test item that appeared in any of the pre or end tests was used in 
any of the weekly tests. As a matter of fact, the majority of weekly 
tests were of a different construction than the objective questions used 
in the pre and end tests. Coaching was controlled in that each 
quarter a new end test was constructed. The end test for one quarter 
would then serve as the pre test for the following quarter. The final 
selection of items and the validation of the test items were made by 
members of the staff who were not teaching a course in educational 
psychology during the quarter in which the test was to be used. 
However, a core of fifty items was common to all pre and end tests. 
An analysis of these items showed precisely the same differences 
between the various groups of students as did the test items that were 
new from quarter to quarter. 


PAIRED GROUP RESULTS 


After evaluating the results as they existed in the actual ‘‘run of 
the classes’’ it was thought advisable to investigate whether the same 
results would hold if two groups were obtained where each student 
in the one was matched with a similar student in the other. This was 
done with the results for each of the three quarters. A rigid pairing 
involving three criteria was used. The criteria were on the basis of 
(1) sex, (2) intelligence test percentile to the extent of ±4 points, 
and (3) pre test score to the extent of ±5 points. Groups of thirty-five,
thirty and forty students each thus were obtained for the 
Autumn, Winter and Spring quarters respectively. Rather than 
present the entire distributions the following table will give the mean 
and the probable error for each of the distributions. 
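
The three pairing criteria can be sketched as a simple matching routine. This is only an illustration under stated assumptions: the tolerances are read as 4 and 5 points in either direction, the greedy first-fit order is an assumption (the article does not say how pairs were chosen among several candidates), and the `Student` records below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Student:
    name: str
    sex: str
    iq_percentile: float
    pre_test: float


def is_match(a, b, iq_tol=4, pre_tol=5):
    """True when two students satisfy all three pairing criteria."""
    return (a.sex == b.sex
            and abs(a.iq_percentile - b.iq_percentile) <= iq_tol
            and abs(a.pre_test - b.pre_test) <= pre_tol)


def pair_groups(experimental, control):
    """Greedy first-fit pairing; each control student is used at most once."""
    pairs = []
    unused = list(control)
    for e in experimental:
        for c in unused:
            if is_match(e, c):
                pairs.append((e, c))
                unused.remove(c)
                break
    return pairs


# Hypothetical records, for illustration only.
experimental = [Student("E1", "F", 55, 118), Student("E2", "M", 70, 125),
                Student("E3", "F", 40, 90)]
control = [Student("C1", "M", 72, 121), Student("C2", "F", 52, 114),
           Student("C3", "F", 60, 100)]

pairs = pair_groups(experimental, control)
print([(e.name, c.name) for e, c in pairs])
```

Students for whom no partner satisfies all three criteria (E3 here) simply drop out, which is why the paired groups are much smaller than the full class enrollments.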


TABLE VI.—SHOWING THE MEAN AND THE PE (DISTRIBUTION) FOR THE PAIRED 
GROUPS DURING EACH OF THE THREE QUARTERS 

                        Int. percentile         Pre test             End test
                       Experi-   Control    Experi-   Control    Experi-   Control
                       mental               mental               mental
 Autumn quarter
   Mean                 54.2      53.9       76.4      75.8      150.9     142.4
   PE (distribution)    14.7      15.1        8         8.2        8.7      12.5
 Winter quarter
   Mean                 72.7      72.8      118.7¹    120        229.3     212.3
   PE (distribution)    15.4      15.1       17.1      16.9       17.2      23.5
 Spring quarter
   Mean                 65        64.6      119.8     119        226.5     200.8
   PE (distribution)    15.1      15.1       20.5      20.7       13.8      19.4

¹ The increase in pre test scores was due to lengthening the test. 


The distributions and their statistical characteristics for the 
intelligence test and pre test results should be similar for the experi- 
mental and control sections, by virtue of these being two of the criteria 
upon which the pairing was based. Table VI shows this similarity. 
The variability for the intelligence test and pre test results for the 
experimental section is almost identical with that for the control section 
during each quarter, as well as the three quarters taken together. 
The end test results show differences, both with respect to central 
tendency and variability. In each case the mean for the experimental 
group is a number of points higher than for the control group. The 
significance of the difference will be set forth in Table VII. The probable
error of the distribution for the experimental section on the end 
test during the Autumn quarter was 8.7, which was only .7 higher 
than for the pre test, while the variability for the control group 
increased from 8.2 on the pre test to 12.5 on the end test. Much the 
same trend occurred during the Winter quarter. In the Spring quarter 
the variability for the experimental group was materially decreased 
from the pre test to the end test, rather than remaining the same as 
theretofore. The probable error for the control group, on the other 
hand, did not change much between the pre and end tests. 

Since a difference exists between the two groups during each 
quarter it is necessary to evaluate the significance of the various 
differences. Table VII shows the critical ratios between the 10th 
percentiles, the 25th percentiles (Q1), and the 50th percentiles (median) 
for the experimental and control groups during each of the three 
quarters. It also gives the number of times per thousand that the 
difference may be considered a real difference. An example will make 
the reading of the table clear. The critical ratio between the medians 
for the paired experimental and paired control groups for the Autumn 
quarter is 0.9. The significance of such a difference depends upon the 
number of times per thousand it can be expected to exist upon further 
experimentation under similar conditions. This may be found in any 
table showing the percentage of area under the normal probability 
curve for various distances on the baseline between the mean and 
successive points of division laid off from the mean. Such a table shows that 
the chances are 816 out of a thousand that the difference is a real one. 
Similarly, for a critical ratio of 1.8, which is found between the 25th percentiles
for the Autumn quarter, the chances are nine hundred sixty-five
out of a thousand that it is a real difference. The difference at 
all of the 10th percentiles may be considered a reliable difference. 


TABLE VII.—SHOWING THE CRITICAL RATIOS AND THE NUMBER OF TIMES PER 
THOUSAND THAT THE DIFFERENCE MAY BE CONSIDERED A REAL DIFFERENCE 

                       Autumn                Winter                Spring
                  Critical    Per       Critical    Per       Critical    Per
                  ratio     thousand    ratio     thousand    ratio     thousand
 Median             .9        816         .9        816        1.7        956
 Q1                1.8        965        1.3        904        2.6        996
 10 percentile     3          999        2.9        998        3          999
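
The conversion from a critical ratio to "chances per thousand" can be sketched with the cumulative normal curve. This sketch assumes the critical ratio is the difference divided by its standard error, so that the chance of a true difference greater than zero is the area of the normal curve below that ratio; `chances_per_thousand` is an illustrative name.

```python
import math


def chances_per_thousand(critical_ratio):
    """Area of the unit normal curve below the critical ratio, times 1000."""
    area = 0.5 * (1.0 + math.erf(critical_ratio / math.sqrt(2.0)))
    return 1000.0 * area


for cr in (0.9, 1.3, 1.8, 2.0, 3.0):
    print(cr, round(chances_per_thousand(cr)))
```

The values computed this way agree with the tabled entries to within about one part per thousand; small discrepancies of that size are to be expected from the rounding of printed probability tables.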

On the whole, then, the differences between the paired experimental
and control groups are somewhat more significant than the 
differences between the distributions showing the total class enrollments.
Again, there is a marked lift in the lower ranges of the distributions
for the paired experimental groups as well as a small difference 
in their favor at corresponding percentiles in the upper ranges. 


PERCENTAGE OF STUDENTS FAILING 


College instructors as well as students place considerable emphasis 
upon the problem of passing. Since definite standards for passing 
have not appeared for college subjects, any decision in this matter is 
usually based upon an arbitrarily determined percentage of students 
who should fail or not receive credit for a course. This arbitrarily 
determined percentage of failures is usually selected from those 
students who fall in the lowest ranges of the distributions of scores. 

Table VIII shows the percentage of students who would fail in 
each of the experimental and control sections if the passing point were 
arbitrarily set at different scores. 


TABLE VIII.—SHOWING THE NUMBER OF STUDENTS WHO WOULD FAIL AT THE 
DIFFERENT PASSING POINTS 

            Winter quarter,               Spring quarter,
            per cent failing              per cent failing
 Score    Experimental   Control        Experimental   Control
  210         17            40              26            57
  200          8            30              12            42
  190          1            21              17            33
  180          1            14               6            20
  170                        9               4            15
  160                        6               2             9
  150                        5               2             6
  140                        2               2             1
  130                        1               2             1
  120                                        1             1
  110                                        1             1

















By starting at the bottom of the table and reading toward the top 
we find that if the passing point were set at one hundred thirty for the 
Winter quarter no one would fail in the experimental sections while 
one per cent of the controls would not pass. If, however, the passing 
point were raised to one hundred eighty, one per cent of the experi- 
mentals would fail as compared to fourteen per cent of the control 
students. At two hundred ten the percentage of failures for the 
experimentals and controls would be seventeen and forty respectively. 
Similar percentages of failures for the Spring quarter may be read 
from the other two columns in Table VIII. The table does not give 
higher percentages since it is seldom customary to consider failing 
more than twenty per cent of the students. 
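
The Winter quarter columns of Table VIII can be checked against the end-test frequencies of Table III. The sketch below assumes that a student fails when his score lies in a class wholly below the passing point, and it uses the lower classes of the Winter end-test distributions as read from Table III (76 experimental and 88 control students); `per_cent_failing` is an illustrative name.

```python
def per_cent_failing(classes, n_total, passing_point):
    """Percentage of n_total students whose score class lies below the passing point."""
    failing = sum(freq for lower, freq in classes if lower < passing_point)
    return 100.0 * failing / n_total


# (lower class limit, frequency); only the classes below 210 are needed here.
exp_end = [(170, 1), (190, 5), (200, 7)]                      # N = 76
control_end = [(120, 1), (130, 1), (140, 2), (150, 1), (160, 3),
               (170, 4), (180, 6), (190, 8), (200, 9)]        # N = 88

for point in (180, 200, 210):
    print(point,
          round(per_cent_failing(exp_end, 76, point)),
          round(per_cent_failing(control_end, 88, point)))
```

At passing points of 180, 200 and 210 this gives 1, 8 and 17 per cent for the experimental sections and 14, 30 and 40 per cent for the controls, in agreement with the Winter columns of Table VIII.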


RESULTS OF THOSE STUDENTS WHO VOLUNTARILY RETURNED FOR THE 
MONDAY INSTRUCTION SPRING QUARTER 


During the Autumn and Winter quarters, returning to class for 
remedial instruction was mandatory for those students scoring low. 
In the Spring quarter, returning to class each Monday was optional for 
all students, regardless of the letter grade they received on the previous 
Thursday test. 

A careful record was kept for each student. The Thursday tests 
were divided into five letter grades according to the percentages given 
by a normal distribution of scores. Of all the students who received 
a letter grade of A on the Thursday tests, five per cent returned on 
Monday for further instruction; thirty-five per cent of the B students 
returned; sixty-nine per cent of the C’s; eighty-four per cent of the 
D’s; and ninety-six per cent of the E students. 

The Monday re-test, each week, was another form of the Thursday 
test, and therefore approximately the same in difficulty. Of those 
students who returned, forty-five per cent received a higher letter 
grade on Monday than the previous Thursday; twenty-nine per cent received 
the same letter grade; while twenty-six per cent actually scored one 
letter grade or more lower on the Monday re-test. 


SIGNIFICANCE OF THE STUDY 


This study had several significant associations with established 
educational procedures. 

1. It again emphasized the value of educational research in any 
serious attempt to improve teaching methods and learning. 

2. Independent of the results which the methods of classroom 
presentation attain, this study seemed to stress that learning is 
affected very distinctly by procedures which adjust a course to indi- 
vidual differences, motivation, and orientation of the student with 
reference to progress and difficulties. Such procedures, therefore, 
would seem to be worthy of greater development, and efforts which are 
directed toward the improvement of educational effectiveness should 
not be confined to classroom presentation alone. 
3. It rendered additional evidence to support the contention 
which favors objective measures as the more valuable method of 
evaluating the effectiveness of instruction. 


CONCLUSIONS 


The conclusions of this study may be summarized under three 
heads: 

1. Upon the most critical point in connection with educational 
standards, the passing point, evidence of this investigation indicated 
that in the use of this experimental procedure the passing point could 
be raised by more than a fourth and still have the same percentage of 
students receive credit for the course. 

2. The evidence of this study showed that the students in the 
upper fourth of the experimental sections scored higher than the 
corresponding group of the control sections, even though the students 
of the experimental sections had from a fourth to a third less time under 
classroom instruction than did the similar group of the control sections. 
On the other hand, the students in the lowest fourth of the experi- 
mental sections scored significantly higher than the corresponding 
group in the control sections, yet the groups of both the experimental 
and the control sections spent the same amount of time under classroom 
instruction. 

3. The evidence in regard to the middle half of the students in the 
experimental sections, showed that their relative standing on the end 
test also was higher than that of the corresponding group in the control 
sections. But while this difference was greater than the difference 
between the highest fourths of the experimental and control sections, 
it was not as great as was the difference between the lowest fourths of 
the experimental and control sections. 
ORIGIN OF THE PLEASURE-PAIN THEORY OF 
LEARNING 


W. H. PYLE 
Detroit Teachers College 


Dr. Hulsey Cason in a recent discussion of the pleasure-pain theory 
of learning¹ says (pp. 464-465): “The pleasure-pain theory of learning 
was originated and clearly described by Spencer, Bain and Baldwin,” 
and also: “The most widely discussed statement of the pleasure-pain 
theory of learning is Thorndike’s law of effect, but he did not add any 
new features that had not already been proposed by Spencer, Bain and 
Baldwin.” Thorndike’s latest statement of the law is to be found in 
his Fundamentals of Learning, 1932 (p. 176) and is as follows: “When 
a modifiable connection between a situation and a response is made and 
is accompanied or followed by a satisfying state of affairs, that connec- 
tion’s strength is increased: When made and accompanied or followed 
by an annoying state of affairs, that connection’s strength is decreased.” 

Now, while it is true that Spencer, Bain and Baldwin clearly 
described the theory, they did not originate it. John Locke, over two 
hundred years ago in an “Essay Concerning the Human Understanding,”
Bk. 2, Ch. 10, said (in paragraph 3): “Attention, repetition, 
pleasure and pain fix ideas. Attention and repetition help much to 
the fixing any ideas in the memory; but those which naturally at first 
make the deepest and most lasting impressions, are those which are 
accompanied with pleasure or pain. The great business of the senses 
being to make us take notice of what hurts or advantages the body, 
it is wisely ordered by nature, as has been shown, that pain should 
accompany the reception of several ideas; which supplying the place of 
consideration and reasoning in children, and acting quicker than 
consideration in grown men, makes both the old and young avoid 
painful objects, with that haste which is necessary for their preser- 
vation.” It seems clear that Locke had in mind the survival value of 
pleasure and pain and their effectiveness in learning and memory. 
For Locke, if I interpret him correctly, both pleasure and pain function 
in memory. Pleasure tells us how to respond to a situation, and pain 
tells us how not to respond to a situation. For Thorndike, if I understand
him, pain prevents the repetition of a response which was 
previously accompanied by pain. In his terminology, the pain 
weakens the bond and thereby prevents the response having unpleasant 
accompaniments. Of course, the unpleasantness is just as positive 
in its effects on behavior as is pleasantness. Pleasantness and unpleasantness
have equal survival value but act in opposite ways. 

¹ The Pleasure-pain Theory of Learning, Hulsey Cason. Psychological Review, 
Vol. XXXIX, No. 5, Sept., 1932. 

Aristotle, some twenty-two centuries earlier, had something like 
the same notion. In “De Anima” he said: “Sensation, then, is 
analogous to simple assertion or simple apprehension by thought and, 
when the sensible thing is pleasant or painful, the pursuit or avoidance 
of it by the soul is a sort of affirmation or negation. In fact, to feel 
pleasure or pain is precisely to function with the sensitive mean, acting 
upon good or evil as such. It is in this that actual avoidance and 
actual appetition consist.”¹ 

While modern experimental psychology has done much to elaborate 
and prove many old conceptions, their origin, in a surprising number of 
cases, was in the minds of certain ancient Greeks. 





¹ Aristotle’s “De Anima,” Tr. by R. D. Hicks, Cambridge, 1907, p. 141. 
COMMUNICATIONS AND DISCUSSIONS 


THE STANDARD ERROR OF THE SPEARMAN-BROWN 
FORMULA WHEN USED TO ESTIMATE THE LENGTH 
OF A TEST NECESSARY TO ACHIEVE A GIVEN 
RELIABILITY 


EDWARD E. CURETON 


Alabama Polytechnic Institute 
The Spearman-Brown formula is ordinarily written, 

$$R = \frac{nr}{1 + (n - 1)r} \tag{1}$$

If we have given a test to a group and know its reliability, r; and desire 
to know the length of a similar test necessary to achieve some given 
reliability, R, we may solve (1) for n, obtaining, 

$$n = \frac{R - rR}{r - rR} \tag{2}$$

The new test, to have a reliability R, should be n times as long as the 
test which has the known reliability r. But since r has been obtained 
by calculation from the scores of a group of N individuals, it is subject 
to sampling error, and so also is n. We require, therefore, a formula 
for the standard error of n as given by (2). Taking logarithmic 
differentials, 

$$\frac{dn}{n} = -\frac{R\,dr}{R - rR} - \frac{(1 - R)\,dr}{r - rR}$$

Squaring, summing for all samples, dividing by the number of samples, 
and reducing, 

$$\frac{\sigma_n^2}{n^2} = \frac{\sigma_r^2}{r^2 (1 - r)^2}$$

Substituting for $\sigma_r^2$ its value, $(1 - r^2)^2/N$, multiplying through by $n^2$, 
and extracting the square roots of both sides, we have finally, 

$$\sigma_n = \frac{n(1 + r)}{r\sqrt{N}} \tag{3}$$

This is the formula required. 

Suppose, for example, a test has been given to a group of 100 subjects,
and its reliability has been found to be .60; and suppose we 
desire to know how long a similar test would have to be to have a 
reliability of .90. Substituting with an r of .60 and an R of .90 in (2), 
we find that n is equal to 6, or that the new test should be six times as 
long as the one already given. But in order to know how much 
confidence to place in this conclusion, we calculate $\sigma_n$ from (3) and 
find that it is 1.6. There is therefore some appreciable probability 
that the length of the new test might properly be anywhere from 2.8 
to 9.2 times that of the given test, in spite of the fact that 100 cases is 
usually considered adequate for the computation of a reliability 
coefficient. 
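
The worked example can be reproduced numerically. The sketch below evaluates formula (2) for the required length and formula (3), taken here as the standard error of n equal to n(1 + r) divided by r times the square root of N; the function names are illustrative, and reading the quoted range of 2.8 to 9.2 as two standard errors on either side of n is an interpretation rather than something the note states.

```python
import math


def required_length(r, R):
    """Formula (2): how many times longer the test must be to reach reliability R."""
    return (R - r * R) / (r - r * R)


def se_required_length(r, R, N):
    """Formula (3): standard error of n when r comes from N individuals."""
    n = required_length(r, R)
    return n * (1 + r) / (r * math.sqrt(N))


n = required_length(0.60, 0.90)
se = se_required_length(0.60, 0.90, 100)
low, high = n - 2 * se, n + 2 * se
print(round(n, 2), round(se, 2), round(low, 1), round(high, 1))
```

With r = .60, R = .90 and N = 100 this gives n = 6 and a standard error of 1.6, so that two standard errors on either side span 2.8 to 9.2.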





REPLY TO DR. LINDQUIST’S “FURTHER NOTE” ON 
MATCHED GROUPS 


MORDECAI EZEKIEL 
United States Department of Agriculture 


In a recent article,¹ Dr. Lindquist made a vigorous attack upon 
the use of Student’s method suggested in my paper.² Two issues 
are involved: (1) Is the use which I proposed for Student’s method 
valid, and in accord with recognized statistical methods? (2) Were 
my criticisms of the method proposed by Wilks and Lindquist unduly 
severe? 

The use proposed is in exact accord with the statement by Fisher, 
R. A.: “Statistical Methods for Research Workers.”’ Second edition, 
pp. 110-112: 


“In cases in which each observation of one series corresponds in some respects 
to a particular observation of the second series, it is always legitimate to take the 
differences and test them . . . as a single sample . . . A more precise comparison 
is obtainable by this method only if the corresponding values of the two series are 
positively correlated . . . to a sufficient extent to counterbalance the loss of 
precision due to basing our estimate of variance upon fewer degrees of 
freedom.” 





¹ Lindquist, E. F.: A Further Note on the Significance of a Difference between 
the Means of Matched Groups. Journal of Educational Psychology, Vol. XXIV, 
No. 1, 1933. 

² Ezekiel, Mordecai: Student’s Method for Measuring the Significance of a 
Difference between Matched Groups. Journal of Educational Psychology, Vol. 
XXIII, No. 6, September, 1932. 
Dr. Lindquist’s criticisms are based upon a failure to recognize 
the mathematical assumptions involved in Student’s method. Wilks’ 
assumptions are clearly indicated in his basic article:¹ 


“For each man of a certain height in one group, there is a man of the same 
height in the second group . . . What criterion of significance should be used on 
the differences in corresponding statistics in the two groups . . . The criteria for 
the significance of such differences in ordinary random sampling are not applicable, 
because although one of the groups might have been drawn at random .. . the 
other group is restricted in that its distribution of heights is made identical with 
that of the first group, and inasmuch as height and weight are correlated, the 
weights of the second group can not be considered as purely randomly distributed.” 


Wilks’ formulae apply only to a universe of specially selected samples. 
In Student’s method, no such drastic restriction of the universe is 
involved. The individual-difference method applies to a universe of 
individual differences. Each case in category A is matched with a 
similar case in category B, according to some criterion X; the differences,
$Y_B - Y_A$, constitute the individual items whose average magnitude
is to be determined. A sample (of these differences) of any 
desired size is taken; the variance of the average of that sample is 
given by the formula 

$$\sigma^2_{M_{B-A}} = \frac{\sigma^2_{B-A}}{n - 1}$$





This formula applies to repeated samples of the same size, drawn 
strictly at random from the universe of differences. It is not necessary 
that additional samples have the same distribution of the X-criterion 
as the original sample; all that is necessary is that each individual 
difference be derived from an A-item matched with a corresponding 
B-item, according to the criterion X. After that point, all the theories 
of random sampling apply. 
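
The individual-difference computation can be sketched as follows. This is a minimal illustration with hypothetical scores; it assumes that the variance of the differences in the formula above is the sample variance taken with divisor n, so that dividing by n - 1 gives the usual estimate of the variance of the mean, and the function name `paired_difference_test` is illustrative only.

```python
import math


def paired_difference_test(a_scores, b_scores):
    """Mean of the paired differences B - A, the variance of that mean,
    Student's t, and the degrees of freedom."""
    diffs = [b - a for a, b in zip(a_scores, b_scores)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Variance of the differences with divisor n (assumed reading of the formula).
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / n
    var_of_mean = var_diff / (n - 1)
    t = mean_diff / math.sqrt(var_of_mean)
    return mean_diff, var_of_mean, t, n - 1


# Hypothetical matched scores: the i-th case in A is paired with the i-th case in B.
a = [52, 60, 47, 71, 65, 58]
b = [57, 63, 52, 70, 72, 61]

mean_diff, var_of_mean, t, df = paired_difference_test(a, b)
print(round(mean_diff, 2), round(var_of_mean, 3), round(t, 2), df)
```

The resulting t is referred to Student's distribution with n - 1 degrees of freedom. Only the differences enter the calculation, which is why the distribution of the matching criterion X need not be held fixed from sample to sample.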

Wilks assumes not only that Group A and Group B have the 
same distribution of X in the initial sample (as would be necessary 
under Student’s method to match the cases), but that each time the 
sampling is repeated, each additional sample shall maintain the same 
distribution of X.² Under Student’s method this is not required; if 





¹ Wilks, Samuel S.: On the Distributions of Statistics in Samples from a Normal
Population of Two Variables with Matched Sampling of One Variable. 
Metron, Vol. IX, No. 3-4, 1-III-1932. 

² Dr. Lindquist states in a letter: “I would consider the restrictions imposed by 
Wilks’ method a distinct advantage rather than a disadvantage in most educational 
research. Your method requires that the individual differences in the experimental 
the sampling were repeated several times, the distribution of X might 
vary at random in each successive set of matched groups. 

There is thus sound basis for the use of Student’s method which I 
proposed. The practical reason for preferring the method is that it 
greatly reduces the number of necessary computations, as compared 
with Wilks’ method. In commenting on this, Dr. Lindquist has been 
hardly fair, for he mentions “the laborious manipulation of data 
characteristic of perfect pupil-to-pupil matching procedures,” and 
later, “assuming a perfect item-to-item matching for a given pair 
of matched samples, Ezekiel computes . . . ” (italics mine). “Perfect”
matching is not required with Student’s method. In my article, 





groups be considered as a strictly random selection from a defined universe of 
individual differences. For example, in an educational methods experiment in 
which pupils are matched with respect to intelligence or achievement test scores 
and later compared on the basis of final measures of achievement, it would be 
necessary by your method to assume that the distribution of differences in these 
measures of achievement for paired individuals represents a random selection from 
a defined population of such differences. We are not accustomed in educational 
research to define populations in terms of differences, and it would not be con- 
venient to do so. Neither would it be easy to demonstrate that the sample of 
differences involved in a single experiment may be considered as a strictly random 
sample of a defined population of such differences. In using Wilks’ method, on the 
other hand, it is not necessary to assume that the distribution of differences is a 
random selection from any population, nor is it necessary to make any assumptions 
concerning the form of the distribution of measures on the basis of which the match- 
ing is effected. Wilks’ method, for example, would describe the reliability of the 
differences in mean achievement for pupils at the same level and with the same 
range of intelligence as those involved in the experiment, and would lead to a con- 
clusion such as the following: The difference between Method A and Method B is 
statistically significant for pupils at the same level and range of intelligence (or 
any other trait used as the matching basis) as those in the experimental groups. 
The general validity of this conclusion would depend, of course, upon the degree to 
which the experimental group is representative of any population to which the 
general conclusion is to be applied. But the procedure is surely no less scientific 
than that of the biologist who generalizes to human beings from experiments 
performed with rats in the laboratory. Using Wilks’ method, one can deliberately 
select the representative rather than a random sample and be able to make definite 
statements concerning the reliability of the results within the limits imposed with- 
out being concerned about randomness of sampling. Considering the difficulty 
of securing strictly random samples from any defined population in educational 
research (because of the linking of crucial characteristics within the sub-groups— 
school classes—constituting the samples that actually are used), and also con- 
sidering the great variety of ways in which populations may be defined and the 
vagueness with which they are, in fact, usually described, I repeat that the restric- 
tions imposed by Wilks’ formula represent a very distinct advantage.” 
I stated “for each case in one sample there is a corresponding case in 
the other sample, with practically identical characteristics in so far 
as the matching criterion is concerned.” In the accompanying 
arithmetic example, I showed cases matched where the IQ’s differed 
by as much as five points. The quotation from Fisher states that not 
perfect matching, but some significant correlation with respect to 
the control factor, is all that is necessary. In practice, any two 
samples of the same number of cases, and with similar distributions, 
such as required by the Wilks’ method, could probably be thrown into 
pairs with sufficient correspondence for the use of Student’s formula. 
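
The computation by Student’s formula for paired cases may be sketched as follows. This is an illustration under stated assumptions only: the function name `student_paired_t` and the scores are hypothetical and are not the data of the earlier article. The statistic is taken in its usual form, the mean of the pair differences divided by its standard error, with n - 1 degrees of freedom.

```python
import math


def student_paired_t(method_a, method_b):
    """Return (t, degrees_of_freedom) for two matched samples.

    Each position in method_a is assumed to be paired with the same
    position in method_b (for example, pupils matched on IQ).
    """
    if len(method_a) != len(method_b):
        raise ValueError("matched samples must have the same number of cases")
    n = len(method_a)
    if n < 2:
        raise ValueError("at least two pairs are needed")

    # Work only with the differences within each pair.
    diffs = [a - b for a, b in zip(method_a, method_b)]
    mean_diff = sum(diffs) / n

    # Unbiased variance of the differences (divisor n - 1).
    variance = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    if variance == 0:
        raise ValueError("all pair differences are identical; t is undefined")

    # Standard error of the mean difference.
    standard_error = math.sqrt(variance / n)
    return mean_diff / standard_error, n - 1


# Hypothetical scores for five matched pairs (illustration only).
scores_a = [12, 15, 11, 18, 14]
scores_b = [10, 14, 12, 15, 11]
t, df = student_paired_t(scores_a, scores_b)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

Because only the differences within pairs enter the computation, the closer the correspondence between paired cases on the control factor, the smaller the standard error tends to be.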

I apologize for my lack of care in constructing empirical data 
to illustrate the way the alternative methods operated. The data 
Dr. Lindquist criticizes were thrown together without any deliberate 
bias, to illustrate the differences in the detailed computations in the 
two methods. My advocacy of Student’s method for this case, how- 
ever, was based upon knowledge of its general usefulness in parallel 
problems in agriculture and other fields. 

I also apologize for my statement that Wilks’ method was essen- 
tially a rediscovery of Student’s method. Wilks’ method is new, in 
that it is based upon somewhat different assumptions, as already 
explained. In addition, Student’s method can be applied only to the 
difference in the means, whereas Wilks’ provides a method for deter- 
mining the reliability of differences in other constants such as correla- 
tion coefficients, determined from matched samples. In this further 
field his method is unique. 

My statement in my previous article, that Student’s method 
gives a more exact measure of the reliability of the differences, was 
incorrect, and I wish to withdraw it. Student’s method and Wilks’ 
method give equally valid results, for the different assumptions upon 
which they are based. Wilks’ formula does constitute a “new con-
tribution to the theory of mathematical statistics” for the specially
restricted field to which it applies, but Student’s method is equally 
valid in the use for which I suggested it in my article.


BOOK REVIEWS


R. F. Young. Comenius in England. London: Oxford University
Press. 


Mr. Young’s research into the visit of Comenius to England is an 
important contribution to the study of the history of education. It 
not only rounds out our knowledge of the movement which led to the 
founding of the Royal Society but links some of the leaders in this 
movement with the early history of education in this country. In an 
earlier pamphlet (Comenius and the Indians of New England, 1929), 
Mr. Young had already dealt with the tradition that Comenius was 
offered the presidency of Harvard College. In the present volume he 
presents the plans for the higher education of the Indians in Virginia 
and New England which were being considered in England in the 
seventeenth century. The presentation of these plans is, however, 
only incidental to the discussion on the basis of documentary material 
of Comenius’ interest and active participation in the movement for 
the organization of a scientific college in England and the pansophic 
movement, which was to serve as the basis of international under- 
standing. These aspects of Comenius’ activities, which were to him 
his dearest interests, are, as a rule, unknown to students who only 
remember Comenius for his Great Didactic and the Orbis Pictus. Mr. 
Young brings out the influence of the Zeitgeist on Comenius, whose own 
accounts of his visit to England and of his plan for a Pansophic College 
are supplemented by letters of his contemporaries (Hartlib, Collier, 
and Dury). The significance of Comenius’ scheme is emphasized 
in a “table of dates illustrating the development of Scientific Societies
and the evolution of encyclopedic ideas in their bearing on schemes 
for the reorganization of society on a new basis and for the spread of 
European civilization among the natives of America.” 

The volume is valuable not only because of its contributions to our 
knowledge of Comenius but as a model of research in the history of 
education. Mr. Young makes the statement that “it was difficult to 
enlist the interest of any statesmen and almost impossible to secure 
any effective support from the general public” for scientific discoveries
without any obvious practical application; in the United States in the 
twentieth century it seems to be impossible to arouse interest in or to 
secure support for humanistic or any other studies in education, which 
are not pseudo-scientific. Such a work as Mr. Young’s should, how-
ever, help to keep alive an interest in the history of education, if for
no other reason than that, of Comenius’ visions of a democratic
school system and of international understanding through education,
one is gradually being realized and the other is at least beginning to
be revived. I. L. KANDEL.
Teachers College, Columbia University.


J. H. Landman. Human Sterilization. New York: The Macmillan
Company, 1932. Pp. XVIII + 341. 


Every elementary student of social work knows there are socially 
inadequate people in the world. They also know that sterilization 
is one way of restricting the multiplication of such people. But is it 
the most satisfactory way? For all kinds of inadequates? “Socially
inadequate” is a very inclusive term. Some state legislators have
tried to include in this classification all the people they had little or no 
use for. Fundamental to a sane consideration of human sterilization 
as a social therapeutic agent is a knowledge of the biological founda- 
tions for the various social inadequacies. Obviously sterilization is 
not the intelligent solution for solving acquired defects which are not 
genetically transmissible. A eugenic program not based on the facts 
of experimental genetics as well as a knowledge of the related social, 
legal or medical implications is not likely to be characterized by either 
balance or intelligence. The most comprehensive treatment of the 
sterilization movement, its foundations and implications that has come 
to the attention of the reviewer is Dr. Landman’s book, “Human
Sterilization.” The book is well documented and contains an
index.

Readers familiar with the thoroughly stupid law which was 
advocated in the state legislature of Missouri a few years ago, a bill 
which included criminals and chicken thieves with the socially inade-
quate who should be sterilized, are aware of the need for a more dis-
criminating attitude in answering the question, “Whom shall we
sterilize?” Of almost equal importance is the question, “Who shall
do the deciding?” A seemingly sensible selection of personnel to
pass on such problems for institutional cases is that contained in the
California law of 1917 which vests the authority in a board of trustees
on the recommendation of the superintendent, a clinical psychologist
with a Ph.D. degree and a physician. H. MELTZER.

Psychiatric Child Guidance Clinic, St. Louis, Missouri.


M. D. Vernon. The Experimental Study of Reading. New York:
The Macmillan Company, 1931. Pp. XV + 190. 


Edward William Dolch. The Psychology and Teaching of Reading.
Boston: Ginn and Company, 1931. Pp. V + 261. 


These two publications are of radically different character. The 
former is a summary of research; the latter, a book for primary 
teachers. 

The Vernon book deals chiefly with eye movements, although there 
are chapters on perception, the reading of children, and typographical 
factors. The author has done research in this field, and the work 
gives every appearance of being a careful and comprehensive survey. 
It does not treat the subject of reading tests, nor the statistical studies 
based upon them. The book is above all a scientific treatise, and is 
probably too technical for undergraduates, but will be valuable for 
professors and those doing research. 

The Dolch book, on the other hand, will find its place in the train- 
ing of primary teachers. Many pedagogical developments seem to 
illustrate the Hegelian Dialectic in exhibiting three phases. There is 
first a vigorous favorable propaganda; this calls forth a conservative 
unintelligent opposition; finally there is a critical evaluation. The 
book under consideration belongs to the last phase. Various tech-
niques, such as presenting phrases on flash cards, are carefully analyzed
in order to ascertain just what the effects are. Reading tests are 
subjected to a similar scrutiny to see exactly what is being measured. 
The reviewer considers this to be a valuable book. It is clear, well 
written, and holds interest. It seems also to exhibit soundness of 
judgment. MELVIN RIGG.

Kenyon College. 


Lawrence Sears. Responsibility, Its Development through Punish-
ment and Reward. New York: Columbia University Press,
1932. Pp. IX + 198.


A description of the plan of this study, necessary in a review, should 
not frighten readers away for there is value in it for a variety of poten- 
tial readers. The author explains his purpose as “an analysis of those 
types of control or education which may be called moral as dis- 
tinguished from legal or physical coercion, and which aim at the 
development of responsibility.” Obviously, its primary concern is
the effectiveness of punishment.

The study consists of three parts: (1) an analysis of the problem
according to certain ethical theorists selected “because of their intrin-
sic significance, and because of the influence they have had”; (2) an
abstract of twelve case histories of problem children selected from 
fifty elaborate case histories, and (3) a reexamination of the theories 
in the first part in the light of the empirical data. “The fundamental
premise of this study is that no merely logical analysis of an ethical
situation is adequate unless there is added to it an empirical examina-
tion of facts, and a testing of conclusions.” Whether or not the
premise is adequately sustained we leave to the philosophers to deter- 
mine. The study interests us from other points of view. 

The twelve subjects of the case studies represent the “bad boy”
common to school communities, that minority of whom it has often
been said they can be appealed to only “through their skins.” Corp-
oral punishment in these cases, however, did not make the appeal in
the sense that it corrected behavior or taught the children to assume 
responsibility for their actions. The cases are not overdrawn but 
“are based on data recorded by experts in the field of clinical child 
guidance.” It is clear from the abstracts that the data were assembled 
by psychiatrists and trained social workers. The indefinite references 
to “the doctor” would have more weight if they had been more
definite in naming the specific types of trained worker involved. The
twelve cases are recommended for those members of the teaching
profession who still believe in the efficacy of corporal punishment. 
Others in the profession more advanced in their thinking will find the 
chapters on Punishment, and Encouragement and Affection stimulat-
ing to a new synthesis of the broadening scope and implications of
education. An understanding of these chapters will give definition 
to the vague, intuitive feelings that education is a long term process. 
To those who would defend the schools and who seek a legitimate 
field of endeavor to help without encroaching on established authorities 
to the detriment of all, a study of the well staffed child guidance 
clinic and the uses of mental hygiene is implicit. The cases cited by
Sears indicate the futility of corporal punishment and forcibly demon- 
strate that misbehavior is a symptom of deepseated problems, symp- 
toms generally disguised and not open to diagnosis except by the 
specialist. There ought to be, and undoubtedly will be in the near 
future, an intensive application of mental hygiene to public education. 
This study of responsibility is another compilation of evidence assuring 
that goal. J. H. COLEMAN.

Huntington, N. Y.


Donald W. Miller. An Orientation in Educational Psychology.
Boston: Richard G. Badger, 1932. Pp. 234. 


Motivated by James Harvey Robinson’s charge that a college is a 
place where there is much teaching but no learning, the author has 
endeavored to supply a text of a radically different type. The book 
presents seventy problems. Some of these selected at random are: 
“Why Study Educational Psychology?” “Inherited Defects,”
“Coefficient of Correlation,” “Learning Curves,” “Thumb Sucking.”
The text in each case merely states the problem, and lists a great 
many references. From these the student is to secure information 
for the purpose of writing a paper which shall embody his conclusions. 

The idea of the author is to create a genuine curiosity on the part 
of the student, and to set up a concrete goal for his reading. The 
preparing of the report is also expected to motivate some real thinking. 
The student must select, compare, note differences in viewpoint, try 
to account for discrepancies, and state his own conclusions. 

This type of text will be fatal to the half baked college instructor 
whose knowledge of the subject scarcely extends beyond one or two 
books. There is practically no information in the text itself, and the 
instructor must of necessity read a large number of the references and 
have background enough to evaluate them. 

A possible criticism of the scheme is that it puts some strain on the 
college library. There are six hundred seventy-six references cited; 
among these are over fifty different periodicals. It is easy to say that 
any college library ought to have these books and magazines; the 
reviewer is dubious as to whether some college libraries do have them. 
Even in a large university a certain amount of ingenuity of planning 
will be needed; librarians become somewhat frantic when sixty students 
demand the same magazine article at the same time. It is, however, 
by no means essential to make all the references available, and the 
students may work on the several problems at different times, although 
another problem arises in regard to the correlation of classroom 
activities with these variously timed reports. 

In short, the reviewer regards the attempt with sympathy, but 
suggests that some of these difficulties be considered by those adopting 
the book. To all instructors of psychology, however, he recommends 
it as a valuable bibliographical reference. MELVIN RIGG.

Kenyon College.


Laurence A. Petran. An Experimental Study of Pitch Recognition.
Pp. 124 (paper). Psychological Monographs, Volume XLIII,
No. 6. Princeton: Psychological Review Co., 1932. 


This study is a valuable contribution to the psychological and 
musical problem of so-called absolute-pitch. It contains a compre- 
hensive, critical survey of previous investigations, and carries the 
solution of the problem further through a somewhat new angle of 
approach and method of procedure. 

Two investigations were made, the major one with a Kohl tone
variator, blown from a regulated oxygen tank, the purpose being to find 
how narrow a range would be identified with a given note of the scale 
and how accurately and constantly this range would be located. Abso-
lute pitch, relative pitch and control groups were used. The other,
less extensive, was made with the piano to test the accuracy of iden- 
tifying these tones, the aim here being to work under conditions 
approximating those usually followed, thus making possible comparison 
of results and their validity. 

Dr. Petran is himself an accomplished musician, in both the instru- 
mental and creative fields. This combination of the psychological 
and musical viewpoint in an investigator is indispensable in studies of
this kind. 

His findings support the extensity theory of pitch rather than the 
resonance theory. There is some “range” in absolute pitch for even 
the best reactors, this range being very narrow in the best cases and 
preponderantly restricted in general to a half-tone, plus or minus. 
There are marked individual variations in amount and consistency of 
range. The accuracy in some cases depends also upon timbre, which is
natural since we deal here with a memory function, but the most 
expert reactors are practically free from such secondary criteria for 
judgment. The study further contains an extensive bibliography. 

Dr. Petran is a graduate of the Johns Hopkins University. He was 
also graduated in composition from the Peabody Conservatory of 
Music where he is at present teaching and carrying on further research 
studies in music. OTTO ORTMANN.

Peabody Conservatory of Music, Baltimore. 


Alma M. Norton. Teaching School Music. Pp. 248. Los Angeles:
C. C. Crawford, University of California, 1932. 


This is a very successful attempt to present the salient features of 
the aims, standards, and methods of procedure of school-music teach-
ing. It is both concise and comprehensive; no single method is unduly
stressed; instead the advantages and disadvantages of standard pro- 
cedures are systematically linked and briefly surveyed. The book, 
accordingly, will serve very well as a text-book in teacher-training 
classes and at the same time as a handy reference volume for experi- 
enced teachers who need a shift of viewpoint. Chapters on Notation, 
Individual Differences, Interest, Part-work, Syllable Work, Music
in Rural Schools, and various activity programs give an idea of the 
scope and practicability of the text. And the statement that the music 
in the schools will improve when the teachers improve is all too true. 
Far too much stress in the past has been placed upon purely academic 
and too little upon the musical training and ability of the teacher. The 
methods advanced in the book must be interpreted as subject to modi- 
fication by inherent differences, many of them quite subtle but still 
important, in details of tonality, rhythmic patterns, melodic outline 
and note-printing. Thus the derivation of minor as the submediant of 
major is open to question. A treatment of these aspects would have 
lent added value, but would also have gone beyond the aim of the 
author and extended a single volume to undue proportions. 


OTTO ORTMANN.
Peabody Conservatory of Music, Baltimore. 


E. George Payne. Readings in Educational Sociology. New York:
Prentice-Hall, Inc., 1932. Pp. V + 376. 


This is the first of a two-volume series dealing with materials which 
are primarily intended to aid the instructor to make what the author 
calls a socioscientific approach to the study of education. It is a 
worthy endeavor on the part of the author and should be greatly 
appreciated by those who are particularly interested in the field of 
educational sociology. The concreteness of data incorporated in the 
volume makes it unique in this respect. 

A glance through the contents of this volume impresses one with the 
extensiveness of the use of the topics or subjects in other related 
fields—particularly in adolescent, educational, and social psychology, 
and social philosophy. But a more complete analysis of the subjects 
and their treatment by the various writers reveals that the content 
has been selected with reference to organizing new-type materials for 
courses in educational sociology. 

It is needless to say that any new field of thought or science embark-
ing upon its own right and attempting to keep house under its own
roof will find much opposition. The author is apparently well aware of
this fact, for he has presented in the first chapter considerable informa- 
tion to clear up the erroneous conception of materials, principles, and 
techniques which rightfully belong to the field of educational sociology. 

The reviewer believes that this volume is a distinct contribution to 
the newly growing science of educational sociology. Students not 
only of sociology but also of education, philosophy, and psychology 
will find the readings very helpful in their work. 


ROBERT G. SIMPSON.
Carnegie Institute of Technology. 


Willi Schohaus. The Dark Places of Education. New York:
Henry Holt and Company, 1932. Pp. 351. 


For ease of book-making Dr. Schohaus’s method can be recom- 
mended. As the editor of a Swiss educational journal he asked his 
readers to send him answers to the question: “From what did you 
suffer most at school?” From the replies the editor selected seventy- 
eight of the most typical reports, printed them and added an introduc- 
tion of one hundred pages. The translation from the German has 
been done efficiently by Mary Chadwick. 

As one reads the reports, astonishment deepens to find that appar- 
ently so many people were supremely unhappy in school. Is this 
situation peculiar to Swiss schools, or is the condition common through- 
out the world? Dr. Ballard, who writes a brilliant introduction, 
believes that a common complaint has been voiced, but the reviewer’s 
experience with the schools of three countries leads him to believe the 
statements are extreme. 

Dr. Schohaus, as a disciple of the great teacher and humanitarian 
Pestalozzi, pleads for more humanity and understanding on the part of 
teachers, and with this plea few of us would be disposed to quarrel. 

P. SANDIFORD. 

University of Toronto. 


Leoni Kaseff. Educação dos supernormaes. Rio de Janeiro:
J. R. Oliveira & Cia, 1931. Pp. 296. 


The title of this work by Professor Kaseff of Rio de Janeiro might 
lead one to expect a minute analysis of the topic, but rather it is a 
general, although thorough, treatment of the subject of how and why 
we have people above normal and what to do with them. Something
of the scope of the volume may be gained by a glance at the table of
contents. Professor Kaseff first considers the Physiology of the Supe- 
rior. Here he cites numerous cases and findings from writers in 
America, Europe, and elsewhere, giving not only the opinions of 
educators, but also of other scientists including physicians. Although 
he assumes a friendly attitude toward the testing movement, and 
expects much from it, he cites contrary opinions and then weighs the 
evidence. His final decision is on the side of scientific measurements 
and applications, but he does not go all the way with the proponents of 
measurements in the United States. 

The second part of the volume is devoted to the psychology of the 
superior boys and girls. Professor Kaseff is not the advocate of any 
particular school of psychology, but presents the facts and lets them 
speak for themselves. This second part of the work is the longest and 
the most interesting section of the study. The third part deals with 
the pedagogy to be followed and applied in the case of the superior 
child. Included here is a consideration of recent sociological views on 
education, especially those that bear on the exceptional child. As a 
whole, the work is scholarly and sane. The single point to lament is
its flimsy binding. C. A. Baker. 

Normal School, Rio Baptist College.


[Mention in this section does not preclude a work from receiving extended notice in a later issue.]


Gordon W. Allport and Philip E. Vernon: Studies in Expressive Movement.
New York: The Macmillan Company, 1933, pp. 269.

F. Alverdes: The Psychology of Animals. New York: Harcourt, Brace and
Co., 1932, pp. 156.

J. G. Beebe-Center: The Psychology of Pleasantness and Unpleasantness.
New York: D. Van Nostrand Co., Inc., 1932, pp. 427.

W. Boven: Adam et Eve Ou la Question des Sexes. Paris: Delachaux & Niestle 
S. E., 1933, pp. 144 (paper). 

Herbert A. Carroll: Prose Appreciation Test. Senior High School, pp. 16.
Junior High School, pp. 12. Examiner’s Manual, pp. 16 (paper). Minneapolis:
Educational Test Bureau, Inc., 1932.

Algernon Coleman, Compiler: An Analytical Bibliography of Modern Lang-
uage Teaching, 1927-1932. Chicago: University of Chicago Press, 1933, pp. 296.

Miriam A. Compton: An Evaluation of History Texts. Philadelphia: McKinley
Publishing Co., 1932, pp. 53 (paper).

K. S. Cunningham and Others: Australian Educational Studies. First
Series. Melbourne, Australia: Melbourne University Press, 1932, pp. 125 (paper).

Thomas D. Cutsforth: The Blind in School and Society. New York: D.
Appleton and Co., 1933, pp. 263.

Knight Dunlap: Habits: Their Making and Unmaking. New York: Liveright,
Inc., 1932, pp. 326.

Helen E. Fairbairn: The Library Profession. Buffalo: The University of
Buffalo, 1932, pp. 31 (paper). 

G. C. Field: Prejudice and Impartiality. New York: Robert M. McBride &
Co., pp. 116.

Adelbert Ford: The Story of Scientific Psychology. New York: Sears Pub-
lishing Co., 1932, pp. 307.

Luther C. Gilbert: An Experimental Investigation of Eye Movements in
Learning to Spell Words. Psychological Review Monographs, Vol. XLIII, No. 3.
Princeton, N. J.: Psychological Review Co., 1932, pp. 81 (paper).

G. R. Giles and John R. Lyall: Occupations in Victoria. Melbourne,
Australia: Melbourne University Press, 1932, pp. 78 (paper). 

D. C. Griffiths: The Psychology of Literary Appreciation. Melbourne,
Australia: Melbourne University Press, 1932, pp. 142 (paper). 

Nora M. Hautes: An Advanced Test of General Intelligence. Melbourne, 
Australia: The University Press, 1932, pp. 64 (paper). 

M. Esther Harding: The Way of All Women. New York: Longmans, Green
& Co., 1933, pp. 335.

Harry L. Hollingworth: Educational Psychology. New York: D. Appleton
and Co., 1933, pp. 540.

George Humphrey: The Nature of Learning. New York: Harcourt, Brace
and Co., 1933, pp. 296.

Edward Safford Jones: Comprehensive Examinations in American Colleges.
New York: The Macmillan Co., 1933, pp. 436.

Horace M. Kallen: Individualism: An American Way of Life. New York:
Liveright, Inc., 1933, pp. 241.

ALINA M. LINDEGREN: Institutions of Higher Education in Sweden. Washing- 
ton, D. C.: United States Office of Education, 1932, pp. 45 (paper). 

M. Luss and Others: The Growing Child: A Series of Five Lectures on Child
Management. Melbourne, Australia: Melbourne University Press, 1932, pp. 
72 (paper). 

C. A. Mace: The Psychology of Study. New York: Robert M. McBride & Co., 
pp. 96. 

Julius B. Maller: Testing the Knowledge of Jewish History. Cincinnati:
Department of Synagogue and School Extension of the Union of American Hebrew 
Congregations, 1932, pp. 252. 

Marriage and Divorce 1931: Tenth Annual Report, Bureau of the Census,
U. S. Department of Commerce. Washington: Government Printing Office, 1932, 
pp. 76 (paper). 

A. Gordon Melvin: Education for a New Era. New York: The John Day Co.,
Inc., 1933, pp. 30 (paper).

James L. Mursell: The Psychology of Secondary School Teaching. New York:
W. W. Norton & Co., Inc., 1932, pp. 468.

Robert Morris Ogden and Frank S. Freeman: Psychology and Education.
Revised Edition. New York: Harcourt, Brace and Co., 1932, pp. 350.

H. T. Parker: Defects of Speech in School Children. Melbourne, Australia:
Melbourne University Press, 1932, pp. 60 (paper).

Lorp Eustace Percy, Editor: The Year Book of Education, 1933. London: 
Evans Brothers Ltd., 1933, pp. 860. 

Willard C. Rappleye, Director: Final Report of the Commission on Medical
Education. New York City, 630 West 168th Street: The Commission on Medical 
Education, 1932, pp. 560. 

Elsie M. Smithies: Case Studies of Normal Adolescent Girls. New York: D.
Appleton and Co., 1933, pp. 284.

Arthur L. Swift, Editor: Religion Today: A Challenging Enigma. New York:
McGraw-Hill Book Co., Inc., 1933, pp. 300.

Hilda Taba: The Dynamics of Education. New York: Harcourt, Brace and
Co., 1933, pp. 278.

Edwin Burket Twitmyer and Yale Samuel Nathanson: Correction of
Defective Speech. Philadelphia: P. Blakiston’s Son & Co., 1932, pp. 413.

Harold M. Williams, Clement H. Sievers and Melvin S. Hattwick: The
Measurement of Musical Development. Iowa City: State University of Iowa, 1933,
pp. 191 (paper).

Edith A. Wright, Compiler: Bibliography of Research Studies in Education,
1930-1931. Bulletin, 1932, No. 16, United States Office of Education. Washington: 
Government Printing Office, 1932, pp. 457 (paper).